AUTHOR'S PREFACE

I undertook for the last German congress of psy-
chology, held at Berlin, April, 1912, a general review
of the psychological methods of testing intelligence. 
As I had only an hour at my disposal in my address, 
I could at that time do little more than outline cer- 
tain of the main features of this very broad field. It 
seemed to me, however, hardly desirable to publish 
the address in the form in which it was given. I 
felt, on the contrary, that in view of the now ever- 
increasing interest displayed in the theme both in 
Germany and elsewhere and in view of the extraordi-
narily scattered nature of the literature (much of
which, by the way, is difficult of access), an ex-
position of the topic on a wider scale was demanded.
So I have tried to elaborate my original review to 
this larger scale. I have treated in it three main 
topics: single tests, the serial method (after Binet- 
Simon) and the methods of correlation and estima- 
tion. 

In the form of my treatment, also, I have over-
stepped the bounds of the mere "general review." I
have not confined myself to setting down what now 
exists, but have myself taken an attitude toward the 
problem, have offered criticisms of the methods and 
made proposals for their modification and develop-
ment. In making these criticisms and suggestions I
have been able to use the experience that has come 
from the tests of intelligence which have been in 
progress at Breslau for some years past. Many of 
these experiments, in which psychologists, educators 
and physicians have cooperated in a gratifying 
manner, have already been published; others are 
still in progress. Yet, thanks to the courtesy of these
workers, I am able to make a preliminary report of 
some of these as yet unfinished investigations. I 
have also taken the opportunity to incorporate some 
minor contributions to the problem that have origi- 
nated in the exercises of the Psychological Seminary 
at Breslau. 

The subject under discussion is limited to some 
extent by the circumstance that tests of intelligence
have been almost always restricted to children and 
youths. But it is just the peculiarity of the psycho-
logical methods of intelligence testing (psycholog-
ical in the narrower sense, in contrast, e. g., to the
psychiatrical methods) that they take their start
from the mental life of the child, though later, of 
course, the attempt is made to carry them over into 
test methods for adults. On this account I have
treated in some detail the results that accrue to peda- 
gogy, and not only to the pedagogy of auxiliary 
classes and of the subnormal child, but also to the 
pedagogy of the normal child. 

In my judgment, intelligence testing is one of the 
most promising fields of applied psychology, using 
that term in the strictest sense. For this reason I 
wanted to make this survey of it accessible to wider 
circles of readers outside the psychological profes-
sion, especially to teachers of normal and of back-
ward children, to school administrative authorities,
to school physicians, to specialists in nervous and in 
children's diseases, and to those engaged in child 
welfare work. This special edition, accordingly, has 
been arranged. I hope that it will demonstrate to 
the workers in these circles the great importance and 
fruitfulness of the psychologist's methods and at the
same time show them the difficulties and the gaps in 
the present status of this work, and that so plainly 
as to prevent overhasty attempts at practical appli- 
cation. 

W. Stern.
Breslau, October, 1912. 



TRANSLATOR'S PREFACE 

This translation of Stern's Die psychologischen 
Methoden der Intelligenzprüfung has been under-
taken because the monograph, though dealing with a
different topic, aims, like my previous translation of 
Offner's Mental Fatigue, to collate, systematize and 
appraise a mass of scattered and to most readers in- 
accessible material that bears upon a problem of un- 
questioned importance. 

Professor Stern was one of the pioneers and most 
active expositors of the investigation of the psychol-
ogy of testimony, for the furtherance of which he in-
stituted a new periodical, Beiträge zur Psychologie
der Aussage, which was later enlarged to cover the
wider field of applied psychology in general (Zeit-
schrift für angewandte Psychologie). Stern is like-
wise well-known for his contributions to individual
psychology, notably for his important work on indi-
vidual differences (Ueber Psychologie der individuel-
len Differenzen), published originally in 1900 and
completely rewritten in 1911 under the title Die dif-
ferentielle Psychologie, and for his numerous sig-
nificant contributions to the psychology of childhood.
From his Psychological Seminary at Breslau have 
appeared many researches, some of which are re-
ported for the first time in the present monograph.
In conjunction with Lipmann he has also founded the 
Institut für angewandte Psychologie, which aims to
serve as a museum and clearing house for the col- 
lection and dissemination of methods and materials 
for studying and recording the mental processes of 
individuals and for facilitating the application of 
psychology to various practical problems. 

What Stern has aimed to do in the present mono- 
graph is sufficiently set forth in his own preface, but 
it may be added here that his book affords what is, 
so far as I know, the best, and in fact almost the only 
authoritative, critical and compact general survey 
of the literature of intelligence testing which is 
adapted for lay readers as well as for professional 
psychologists. 

In perfecting this translation I have received much 
valuable aid from the members of my class in Ger- 
man Educational Psychology, in which the mono- 
graph was used as a text, and from my colleagues, 
Professor P. E. Pope, of the German Department, 
and Mr. D. K. Fraser, assistant in Educational Psy- 
chology. 

Guy Montrose Whipple.

Cornell University, January 1st, 1914. 

INTRODUCTION
Nature and Problem of Intelligence Testing 

1. Intelligence and Intelligence Testing 

Modern experimental psychology, which started 
with the study of sense-perception and then under- 
took that of ideas and feelings, has in the last decade 
begun to deal with intellectual functions themselves. 
And it is worthy of note that general theoretical psy- 
chology and differential applied psychology took this 
step forward at the same time, though for the most 
part independently. In the former there was devel- 
oped a psychology of thinking, in the latter there 
appeared the investigation of differences in intelli- 
gence. 

Our discussion must be restricted to the second 
problem with which alone we are concerned. To the 
other branch of psychology we may confidently leave 
the question of the general nature of intellectual ac- 
tivity and the investigation of the phenomena that 
constitute thinking as such. What we are interested 
in is not intelligence as a phenomenon, but intelli- 
gence as a capacity and particularly a capacity with 
respect to which men differ one from another. And 
intelligence testing is the determination of the de-
gree of this capacity in a given individual.

The objection is often made that the problem of
intellectual diagnosis can in no way be successfully
dealt with until we have exact knowledge of the gen- 
eral nature of intelligence itself. But this objection 
does not seem to me pertinent. In science there is no 
such precise sequence of the different research prob-
lems. We measure electro-motive force without
knowing what electricity is, and we diagnose with 
very delicate test methods many diseases of the real
nature of which we know as yet very little. Indeed,
it may be asserted, quite on the contrary, that prog-
ress in testing intelligence may shed light from a new
angle upon the theoretical study of intelligence and 
thus supplement the psychology of thinking in a 
valuable manner. If it turns out, for instance, that 
certain symptoms are relevant and others irrelevant 
for the differentiation of the intelligence shown by 
different persons; if, again, one series of these symp-
toms exhibits a high degree, another series a lesser de-
gree of intercorrelation, then our knowledge of the
structure of intelligence must thereby be little by lit- 
tle increased, and thus there will develop a fruitful 
reciprocity between the two phases of investigation, 
theoretical and applied. 

Naturally, we cannot begin our work without a pre- 
liminary definition of intelligence, however pro- 
visional it may be. And this definition must be 
neither too broad nor too narrow. 

Many psychiatrists have used a definition of intel- 
ligence that is too broad. They use intelligence, in 
fact, to include mental attainments of all kinds, all 
those mental qualities, then, that are not volitional 
or emotional. If this position be taken, it follows
evidently that the examination of immediate mem-
ory, of ability to learn, of range of information, of
fidelity of report, or of discriminative sensitivity is 
just as much a constituent part of intelligence testing 
as the examination of ability to apprehend, to syn-
thetize, of capacity to judge, to conclude, to define,
to criticize, etc. Again, a question that is very im-
portant for us, viz.: to what extent intelligence really
enters into these first-named activities, and whether
and in what way it shows signs of its presence in
them, becomes absurd. But the advance made in the
recent development of intelligence testing, in con- 
trast to the uncritical determination of mental level 
by any sort of questions and tests, consists in the 
fact that we not only limit intelligence by setting it 
over against the emotive and volitional nature of
an individual, but also ascribe to it a definitely re-
stricted place within the mental functions.

This delimitation of the sphere of intelligence that 
is even now essential cannot be effected, in my opin-
ion, from a phenomenological, but only from a tele-
ological point of view. In fact, my definition is this:
Intelligence is a general capacity of an individual
consciously to adjust his thinking to new require- 
ments: it is general mental adaptability to new prob- 
lems and conditions of life. 

This definition differentiates intelligence clearly 
from other mental capacities.

The fact that the adjustment is made to the new 
distinguishes intelligence from memory whose fun- 
damental teleological feature is the conservation and 
utilization of conscious contents already given. 

The fact of adaptation, again, emphasizes the de-
pendence of the performances upon external factors,
on the problems and demands of life, and thus dis- 
tinguishes intelligence from genius, whose nature is 
to create the new spontaneously. 

Finally, the fact that the capacity is a general
capacity distinguishes intelligence from talent the 
characteristic of which is precisely the limitation of 
efficiency to one kind of content. He is intelligent, 
on the contrary, who is able easily to effect mental 
adaptation to new requirements under the most 
varied conditions and in the most varied fields. If 
talent be a material efficiency, intelligence is a formal 
efficiency. 

I trust that these distinctions may serve to lessen 
the confusion that has been current. It is not so long 
ago, indeed, that in psychiatry 'information tests' 
were carried on as 'intelligence tests,' thereby con- 
fusing memory and intelligence. And we often, even 
nowadays, find intelligence and talent confused in 
everyday life. In the school, for instance, a teacher 
of a special subject like mathematics, who perceives 
the special gift of a pupil in that field, may easily 
come to believe without further evidence that this 
pupil has general ability, or in other words, to rate 
him as an intelligent pupil. 

But we should not interpret this delimitation to 
mean the erection of sharply distinct faculties, as in 
the old faculty theory. Intelligence, for instance, 
does not function by itself and memory by itself; 
rather, every operation of memory is more or less 
impregnated with intellectual functions and vice 
versa: the extent of this interconnection can be indi- 
cated only by the correlation of the tested symptoms. 
But just on account of this composite character of
every actual mental process it seems to me that the
definition of intelligence I have given above is indis-
pensable as a regulative principle for further investi-
gation: I mean that any sort of perceptive, memorial
or attentive activity is at the same time an intelli- 
gent activity just in so far as it includes a new adjust- 
ment to new demands. 

We must add one final limitation: we are consid-
ering only those phases of intelligence testing that
deal with a scale of degrees. This does not mean to
minimize in the slightest the importance of qualita-
tive differences in types of intelligence (analytic-
synthetic, objective-subjective, etc.); we need only
refer to the importance of the essay as a means of
testing for these phases.¹ But we shall discuss in
this monograph only those forms of procedure that 
permit us to say of a given person that his intelli- 
gence is of such and such degree. 

As the title of the book indicates, the problem of 
method will be prominent throughout our presenta- 
tion. We can thus best do justice to the present 
status of the question, for the significance of the re- 
sults thus far obtained lies particularly in the fact 
that they serve to provide new suggestions for the 
perfecting of our methods. 

¹On this aspect of intelligence, consult the general review and
bibliography given in my earlier discussion (1: pp. 203-213, 433-4).

[Note: numbers in parentheses refer, unless otherwise indicated,
to the reference list at the end of this monograph. Translator.]

2. Practical Problems of Intelligence Testing
Since we have to do here not with methods de-
signed for purely theoretical investigations, but with
methods that are to be employed in daily life, their
form is determined, at least in part, by the practical 
needs that are to be satisfied by intelligence testing. 
We must distinguish four groups that arise from the 
combination of the two pairs of terms: abnormal and
normal, adult and child.²

(a) Adult, abnormal individuals form the chief 
material of the psychiatrists, who in consequence 
were the first to want to test intelligence.³ Not only
have they invented single methods, but they have also 
devised whole series or systems of examination 
(Rieger, Kraepelin, Sommer, Ziehen, Gregor, Bern-
stein, Rossolimo, et al.). The contents of these sys-
tems are such as to bring them only partially within
our scope; by far the greater portion of them take 
on the character of questions and qualitative tests 
rather than that of quantitatively gradable tests; 
even where these latter have been used, comparative 
material for normal persons is often enough want- 
ing. Whether the outcome of any one of these tests 
might really indicate an abnormally weak intelligence 
was frequently judged on the basis of a preconceived 
opinion as to how normal men might be expected to 
react to the test in question. In recent years this 
has been remedied. Rodenwald (22) showed with
regard to a group of information tests how much of 
what had a priori been deemed abnormal really lay 
within the bounds of normality. Many psychiatrists 
have sought to obtain comparative standards for
their methods by extensive application of them to
normal persons (Sommer, 26; Ziehen, 30; Ransch-
burg; Rossolimo, 23-25). Others have turned to ac-
count the fact that certain methods had already
been tried out extensively by psychologists upon
normal persons, e. g., Ebbinghaus' completion
method, the report experiment.⁴ But how far all this
comes from meeting the need of the alienist himself
is shown by the decision of the International Con-
gress of Physicians to turn to the psychologists in
order to secure normal series for the various psy-
chiatrical tests of intelligence. This task has been
undertaken by the Institute for Applied Psychology.

²A similar division is used by Meumann (15), though he, to be
sure, defines intelligence somewhat more broadly than do we.

³An extensive general summary of the more important methods
of intelligence testing used by alienists will be found in Jaspers
(12).

(b) Abnormal children have become, just in the
last few decades, a center of pedagogical, socio-polit-
ical, and medical interest. The whole pedagogy of
the subnormal, the schema of auxiliary schools and 
special classes, the juvenile court and the various 
protective and corrective institutions are, indeed, 
matters of very recent development, but they are de- 
manding a more exact study of the individuality of 
the child, both for purposes of mental diagnosis and 
for 'psychotechnic' purposes (training, treatment, 
punishment, etc.). To meet these needs, the determi- 
nation of degree of intelligence is, though not the 
only, at least a most important factor. 

⁴For a general account of these methods, see the translator's
Manual of Mental and Physical Tests, Baltimore, 2d ed., 1914.

The weaknesses of the psychiatrical methods men-
tioned above were doubled when these methods were
applied to these new problems. With adults we knew
little enough of the normal standard to which the per-
formances of abnormal subjects were to be com-
pared, but with children we knew nothing at all.
What is more, one normal standard is not enough in 
this case ; every age-year must have its own standard. 
The magnitude of a defect of intelligence in a nine- 
year old child can be determined only by comparing 
it with the normal nine-year old intelligence, and so 
with other ages. The consequent demand for the 
creation of normal test-series for each year of child- 
hood was met, as a matter of fact, not from the side of 
psychiatry, but from that of psychology. Alfred 
Binet, with the cooperation of the physician, Simon, 
has created such a graded series of tests; and al- 
though the system as it now stands may be far from 
final, its fundamental conception will retain its per- 
manent value and will doubtless lead us ultimately to 
a completely satisfactory solution. His method has 
already attained international usage. We shall dis-
cuss it fully in the second part of our treatment.
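
The comparison with an age-specific standard described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `NORMAL_SCORE_BY_AGE`, its values, and the function `deficit_against_age_norm` are hypothetical and are not taken from the Binet-Simon series or from any published norms.

```python
# Hypothetical mean test scores of normal children, one standard per age-year.
# The numbers are invented for illustration only.
NORMAL_SCORE_BY_AGE = {7: 20.0, 8: 24.0, 9: 28.0, 10: 32.0}


def deficit_against_age_norm(age_years: int, score: float) -> float:
    """Return how far a child's score falls below the norm for his own age.

    A positive value means the child scored below the age norm,
    a negative value means he scored above it.
    """
    if age_years not in NORMAL_SCORE_BY_AGE:
        raise KeyError(f"no normal standard available for age {age_years}")
    return NORMAL_SCORE_BY_AGE[age_years] - score


# The same raw score of 24 is normal at eight years but a deficit at nine.
print(deficit_against_age_norm(8, 24.0))  # 0.0
print(deficit_against_age_norm(9, 24.0))  # 4.0
```

The point of the sketch is merely that one raw score has no fixed meaning until it is set against the standard for the child's own age-year.
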
(c) Normal children and youths. It is not to be 
supposed, however, that intelligence testing of 
normal children has merely the secondary import- 
ance of supplying standards of comparison for in- 
vestigations of the feeble-minded. On the contrary, 
the gradation of intelligence within the range of 
normality is an entirely independent problem that is 
closely connected with practical pedagogical inter- 
ests. The ordinary school examinations afford a 
notion of the pupil's knowledge and of his external 
accomplishments, but they do not afford an index 
of his inner endowment, of his mental maturity and 
power; it is here that psychological tests must sup- 
plement other forms of examination. This need is 
especially evident at entrance examinations, but it 
exists within the ordinary administration of the
school as well, for the demand, nowadays so em- 
phatically voiced, that instruction shall be individ- 
ualized to the fullest possible extent, presupposes 
a fuller insight into the nature of individualities. 
Very recently, in fact, serious attempts have been 
made to make divisions into classes and sections on 
a psychological and qualitative basis (special classes 
for the subnormal, classes for the backward, sepa-
rate classes for the specially gifted, 'parallel'
classes with normal and minimal courses of instruc-
tion for pupils of different degrees of ability in par-
ticular subjects), attempts that demand, as an in-
dispensable prerequisite, the possibility of very ex-
act determination of the actual degree of mental
ability.⁵

⁵All these pedagogical reform-movements that are related to the
problem of intelligence were the general subject of discussion at
the first German Congress for Child Training and Paidology
(Kongress für Jugendbildung und Jugendkunde) that was con-
ducted by the School Reform Association (Bund für Schulreform)
at Dresden, 1911. The addresses and discussions of this congress
have been published in separate form (11): the special problem of
testing intelligence was discussed in the addresses of Meumann,
Kramer, and the author.

In this connection we must, of course, guard
against the danger which is apt to arise of suppos-
ing that we have grasped the individuality of a pupil
in its totality when we have tested his intelligence.
The fact that intelligence can be more easily treated
quantitatively than can other individual capacities
must not lead us to overestimate its import. Never-
theless, the fact that we can deal with intelligence by
itself does serve to disclose the structure of the in-
dividuality. We can determine whether a perform-
ance of greater or lesser degree depends on talent
or on intelligence; we can investigate what degree 
of correspondence exists between the experimental
results and the teachers' judgments of the intelli- 
gence of pupils; we can delimit the extent to which 
general school efficiency is dependent on intelligence 
itself on the one hand and on non-intellectual factors 
on the other hand — a delimitation that, as will be 
shown later, forms one of the chief merits of the 
psychological methods. 

The studies of normal children that bear directly 
upon our problem were first carried on by separate 
tests: this method, originated in Germany, has been 
very extensively employed and further developed in 
France and especially in America. Then arose in 
France Binet's system of tests with age gradations 
that we have already mentioned. England has 
lately joined the movement to good effect by giving 
us the correlation method for use in the more pre- 
cise testing of intelligence (Pearson, Spearman, 
et al.) These three main lines of activity will fur- 
nish the principle of division of our subsequent treat- 
ment. 

(d) Normal adults. Here we find ourselves in
a realm whose exploitation is entirely in the future, 
for the tests of intelligence thus far administered to 
normal adults have not been undertaken for the 
sake of these persons, but only to get comparative 
standards for abnormal persons. Yet even now new 
developments are to be noted. Münsterberg shows
how important an exact knowledge of individuality 
would be for determining choice of a vocation and 
he has already suggested ways in which the voca-
tional bureaus that exist in America might arrange
psychological tests (19, 20). And Captain Meyer 
(17, 18) sees in intelligence testing a method that 
ought to help the recruiting office to keep unfit can- 
didates off the enlistment rolls. 

These last considerations show that the chief em- 
phasis of intelligence testing, which has hitherto 
lain wholly within psychopathology, must in the 
future be shifted distinctly toward normal psy-
chology: so the labor expended by psychology in se-
curing a reliable method will benefit not only
physicians and those concerned in teaching the ab- 
normal, but also jurists, military officials, those con- 
cerned in teaching the normal child and others. 

But just this anticipated extension of the practical 
applicability of intelligence tests necessitates sev- 
eral words of warning. 

(a) We are still in the midst of our preliminary 
work on method. The methods that now prevail — 
and this is true also of the Binet-Simon system — are 
not yet to be regarded as diagnostic canons that ad- 
mit of official prescription. The law passed in New 
Jersey that directs the use of intelligence tests with 
all pupils suspected of backwardness seems on this 
account very premature. So, too, it will be long, 
very long, before we realize the optimistic hope that 
Spearman attaches to the correlation method of test- 
ing intelligence, when he says: "Indeed, it seems 
possible to foresee the day when there will be an an- 
nual official determination of the 'intellectual index' 
of every child in the empire" (Hart-Spearman; 75, 
p. 78). 

(b) It must be understood that tests of intelli-
gence are not easy to conduct. Their administra-
tion demands extended practise, psychological train-
ing, and a critical mind. Thus, for instance, the
average teacher, whose work has been with the 
wholly different methods of pedagogical question-
ing and examining, is very apt to apply psychologi-
cal tests in those forms in which their value would
be positively illusory. If, accordingly, the use of
tests for practical purposes shall attain any very
large currency, the training of a specially psycholog-
ically drilled personnel will become a necessity.
School psychologists would then take their place side
by side with the school physicians.⁶

What erroneous ideas prevail concerning the ease of conducting 
tests is illustrated, e. g., in the declaration of Captain Meyer that 
in military enlistment tests of intelligence could some day be 
carried on quite mechanically by subalterns. But, as a matter of 
fact, a psychological test is quite a different thing than the de- 
termination of weight or of stature which might very well be 
carried out by minor military officers. 

(c) Psychological tests must not be overesti-
mated, as if they were complete and automatically
operative measures of mind. At most they are the
psychographic minimum that gives us a first orien-
tation concerning individuals about whom nothing
else is known, and they are of service to complement
and to render comparable and objectively grad-
able other observations (psychological, pedagogical,
medical), not to replace these.⁷

⁶On the demand for school psychologists, see 11, p. 19.

[The situation in America is discussed by J. E. W. Wallin in
two interesting papers, Jour. of Educ. Psychol. 2: 1911, 121 and
191. Translator.]

⁷Similar warnings against the overestimation, mechanization
and dilettante employment of tests are to be found in Myers (21),
Bobertag (40), and also in Binet's last work (37, pp. 155 ff.).

[Cf. also the translator's Manual of Mental and Physical Tests,
Ch. 1.]



I. Single Tests and Series of Tests 

1. Single Tests 

All psychological experiments may be divided, ac- 
cording to their problem, into research experiments 
and test experiments. The latter are now generally 
known as "tests;" their aim is "to determine for a 
given individual his mental constitution or person- 
ality or to determine a single one of his mental 
traits."¹ Tests include, of course, not only experi-
ments in the narrower meaning of an investigation
carried out with the aid of instruments, but also 
simple methods of procedure that do not involve the 
use of instruments — questions, problems, presenta- 
tion of pictures, and the like — provided that these 
are administered in a systematic and scientifically 
regulated manner and that their results are re- 
corded. 

¹See my earlier text (1, p. 87).

Now, in no field have so many tests been proposed
and put into operation as in the field of intelligence
testing. To give a complete exposition of all these
test methods and of the results that have been gained
through them would exceed the bounds of this
monograph. But this is not necessary, after all, be-
cause, as will be shown in a moment, the funda-
mental significance of our whole problem lies not in
the single tests, but in the construction of well-con-
sidered systems of tests, for which single tests
merely supply the raw material. So we shall con-
tent ourselves in this part of our essay with a cur-
sory survey without any pretense at all to complete-
ness.²

The varied nature of the proposals and test inves- 
tigations thus far made is due to the fact that the 
same problem has been approached in very different 
ways. 

(a) For a long time we started from the errone- 
ous presupposition that any psychological method of 
experimentation would be really usable as a test. 
It was thought that all that was necessary was to 
alter the direction, so to speak, of the plan of in- 
vestigation. When a large number of measurements 
had been secured by a single method on a few per- 
sons in the laboratory, the same method was ap- 
plied to many persons, but only once or a few times 
to each of them. If it turned out from such a mass 
experiment that the more intelligent persons ob-
tained, all things considered, better average scores
than the less intelligent, then it was assumed that
the method would answer for testing intelligence.

²For all the literature on single tests, see my text on differential
psychology (1, 426 ff.); also in Appendix II of that book there is a
survey of the relation of the single tests to school performance.
Fifty-four different tests, with numerous sub-types, are described,
together with their methods and chief results, in Whipple's
Manual (28). A very large collection of materials for testing was
exhibited by the Institute for Applied Psychology at the Berlin
Congress, Easter, 1912, for information about which Lipmann's
catalog in the report of the Congress may be consulted. Since
the meeting, this exhibit has been made a permanent one and has
been assigned a room in the exhibition by the Prussian Ministry of
Education of German material for instruction, at Berlin, 126
Friedrichstrasse. The exhibit can be seen at that place by pre-
vious appointment with the Secretary of the Institute (Dr. Lip-
mann, Telephone Potsdam, No. 8).

Nearly all the methods that were familiar to the 
psychological experimenter have been tested out in 
this way, especially in the earlier periods of in-
vestigation by mental tests, e. g., measurements of
reaction-time, determinations of the threshold of 
differential sensitivity in the different modalities, 
optical illusions, experiments on motor skill or 
strength, association experiments, tachistoscopic ex- 
periments, learning of syllables, etc. In some cases, 
it is true, numerous results of interest were secured, 
but it must be admitted that a good deal of energy 
has been expended to little avail in these experi- 
ments. 

(b) A significant advance was made when it was
finally recognized that this blind probing about 
could not lead us farther, that, on the contrary, tests 
of intelligence must be definitely selected on the 
basis of certain presuppositions that were to be 
made concerning the nature of intelligence. Investi- 
gators, therefore, sought then for exact methods of 
experimentation that would bring intelligence into 
direct and manifest operation. To be sure, the prob- 
lem was at first conceived of in a still too simple 
form, in that intelligence was thought to be exhibited 
as a definite clean-cut mental phenomenon and the 
plan of testing was directed to the examination of 
this assumed special phenomenon. 

The best-known instance of this is the so-called 
'combination method' of Ebbinghaus, now better 
designated as the 'completion method' (5). In Eb-
binghaus' view, every true instance of intellectual
ability may be reduced in the last analysis to an act
of 'combining,' i. e., to a process of synthetizing con-
scious contents that previously had been present
separately; accordingly, he invented that method in
which the subject of the test is to supply the correct 
connections between the separated parts of a text in 
which gaps have been introduced. 

This principle of combination or completion has 
been used by many other investigators as a basis 
for various forms of test. 

Thus, Ries (78) used two tests to measure the ability to bring
two terms into a logical relation: A. Pairs of words were pre-
sented that had a logical connection, e. g., fire-smoke, flood-need;
then a test was made whether the naming of the first member of 
the pair reinstated the second by dint of the logical connection. 
B. Single words were given to which such words were to be ad- 
joined as would form a causally connected pair. A similar method 
is that of Winteler in which a term is to be named that is super- 
ordinate, sub-ordinate or co-ordinate to the word given. 

The combination test of Masselon in which a meaningful sen- 
tence is to be made from three given words has been extensively 
used. Recently, Meumann (16) has elaborated this method in a
special fashion; he presents words so chosen that they can be
joined in a sentence either in a banal and logically rather crude
way or in a logically pertinent way, e. g., ass, blows; poor solution:
"The ass receives blows." Good solution: "The lazy ass receives
blows." The tendency toward the former or the latter rendition 
is taken as an index of intelligence. 

Heilbronner's picture-test (8, 27) examines ability to complete 
in the sphere of vision: the outline of an object is shown on a 
series of small cards and in such a way that there is a pro- 
gressive development from an initial very fragmentary outline by 
successively more detailed stages up to a complete picture of the 
object. The idea is to find out at what stage of incomplete delinea-
tion the object can be recognized.

To this class of tests belongs also the fitting together of cut up 
pictures (method of the Russian alienists, Bernstein and Rosso- 
limo). 

Other psychologists, however, have considered 
other and quite different mental functions to be the 
touch-stone of intelligence. 
Thus, in an earlier stage of his work Binet (2) be-
lieved that the essence of intelligence was capacity
to adjust attention: for this reason he used tests of
attention, like the cancellation of letters in a speci-
fied text (the Bourdon test), the copying of sentences,
the esthesiometer (Binet regarded the discrimina-
tion of two near-lying compass-points as a phenom-
enon of attention, not of sensation), the sorting
of cards containing the alphabet, or numbers, etc.
In the work of Meumann (14) we note at times the
laying of a certain one-sided emphasis on the under-
standing of the abstract as being the root of intelli-
gence. This was why he specially recommended the
use in testing of the retention of abstract words. 
Quite a number of investigators have directed their 
attention particularly to capacity to apprehend as 
the index of intelligence, and hence have preferred 
to use for tests such things as the apprehension of 
pictures or ability to perceive linguistic material of 
different contents and extents. 

(c) We may consider as a third main class of 
tests those patterned after familiar pedagogical 
tasks. There are, indeed, certain school activities 
that admit of relatively precise grading, since they 
can be rated both in terms of quantity (amount done 
within a given time) and in terms of quality (fre- 
quency of mistakes). Those schoolroom tasks are 
most obviously adaptable for psychological pur- 
poses within which the course of activity is fairly 
homogeneous, e. g., the computation of specified 
arithmetical problems, writing from dictation, com- 
mitting to memory of vocabularies and poems, and 
all these tasks have, in fact, been used for testing in-
telligence. Evidently a chief objection to this
method is that the activities mentioned are depen- 
dent to a large degree upon external conditions of 
the instruction, so that the intelligence of individ- 
uals that are working, or that have worked, under 
different school conditions cannot be subjected to 
comparative tests by means of these activities. 

(d) A fourth main class of tests is still farther 
removed from the precision of the laboratory ex- 
periment, but is thereby more nearly allied to real 
life. These tests aim to secure records of such evi- 
dences of intelligence as are accepted in ordinary 
life as special evidence of it. These direct tests of 
intellect have been specially developed by the psy- 
chiatrists: they comprise such things as defining, 
comparing, differentiating, the understanding of 
proverbs, grasping the point of a joke, seeing ab- 
surdities in verbal or pictorial presentations. 

These tests have the advantage that in them in- 
telligence is undoubtedly much more directly opera- 
tive than in the others : but on this account it is im- 
possible in most of them to scale the results: they 
are "alternative tests," that admit of but the rough 
differentiation into right or wrong (+ or −). The
single test of this sort, therefore, does not make it 
possible to secure any very precise characterization 
of the person tested, or to rank him in a scale. 

2. The Inadequacy of the Single Test 

A critique of all these confusingly many attempts 
might be undertaken by examining them, test by test, 
to see which ones deserve to be recommended as in- 
dicators of intelligence. But we feel that far more 
important than such a special scrutiny of single tests
is the laying of emphasis upon a general critical
position: no single test, no matter how good it may
be, should ever be made the instrument for testing
the intelligence of an individual.¹

Because the single test tests on the one hand more, 
and on the other hand, less than it really ought to 
test. 

More, because the mental activity that is aroused 
in a subject by an experimental task, a test-question, 
or the like, is the fused resultant of quite varied an- 
tecedent conditioning factors: and we do not know 
what share that particular conditioning factor that 
we call intelligence played in the performance. In 
this equivocal nature of the object under investiga- 
tion lies the too often little noted distinction between 
tests and laboratory experiments. If I arrange an 
investigation of memory in the laboratory, I know 
that I am actually examining memory and not some- 
thing else, because in numerous single experiments 
I vary in a measurable way certain conditions only 
of the function of memory while I keep all the other 
conditions constant. But when, on the contrary, I 
administer a test of learning or a test of immediate 
memory by itself to a person, the outcome is affected 
by the real capacity of retention, understanding of 
the material, attention, interest, etc., all without con- 
trol — and this quite regardless of the disposition 
of the subject at the time. Or, take another exam-
ple: suppose that a subject has made a good record
in the filling out of gaps in a text (Ebbinghaus' com-
pletion test), does this good performance depend
predominantly upon a real capacity for logical com-
bination? Or upon a specially large vocabulary?
Or upon a fine feeling for language? Or upon prac-
tise in guessing riddles?

¹Cf. Binet (36, p. 201): "One test has no meaning, but five or
six tests do mean something. * * * The attention of psy-
chologists must, then, be called especially to this principle of the
multiplicity of tests."

The only way to analyze out from this fused re-
sultant the ability we are after (in this case, for in-
stance, the ability to effect combinations) is obviously
to add several more tests of a different kind that
will also involve the process of combining, but that
will in addition involve mental processes of quite
different sorts. Correspondences that may appear
in the results of these different tests may then be
ascribed with probability to their common factor,
in our example, to the ability to effect combinations.
The ability sought for must, therefore, be plotted,
as it were, from different positions.

Too little. But suppose that we have succeeded
in determining a subject's ability to make combina- 
tions not by a single test but by a smaller number of 
different 'combination' tests, have we then meas- 
ured his intelligence? By no means, for we have 
now determined far too little. Intelligence, it is to 
be noted, means an all-round ability; it refers to 
the general mental attitude toward new demands, 
and combining is only one side of this attitude. The 
other sides possess equal significance, e. g., the 
grasping by consciousness of a newly presented ob- 
ject (apprehension, apperception, understanding), 
the dividing of a whole into its parts (analysis), the 
taking of an intellectual attitude toward a content 
(judging, criticizing, deliberating, and deciding),
etc. 

These functions of intelligence must, then, be con- 
sidered in their totality; and the actual testing of 
them ought not to be omitted unless we were certain 
that they had already been examined by implication 
along with some other tested function. Suppose 
that in a group of persons it had been possible to 
show that X had the best ability to combine; is it 
then certain that he would also take first place in 
other forms of activity involving intelligence and 
that he might, accordingly, be ranked first in total 
intelligence ? 

To ask this question is enough to insure a nega- 
tive reply. I feel, I admit, that Spearman (75, 77, 
80) is right in asserting that intelligence does really 
signify a general capacity which colors in a definite 
way the whole mental behavior of an individual. 
But we must not force this idea (nor does Spear-
man) so far as to assume that all the separate con-
stituent functions of intelligence in the different
fields are mechanically of equivalent degree. Such
a view is, indeed, contradicted by the circumstance
that there is operative in each individual bit of be-
havior not only a given quantity of intelligence, but
also the special quality of intelligence of the person
tested, and besides these a varied number of other
mental traits. Thus, there are persons who have a 
pretty high grade of general intelligence, but who 
manifest it much better in analytic and critical than 
in synthetic work ; again, there are persons in whom 
the receptive activities of intelligence (apprehend- 
ing and understanding) are superior to the more 
spontaneous activities, and so on. 



However, everyday life shows that we can disre-
gard these qualitative differences and nevertheless
may characterize the general grade of intelligence
that a man possesses. When we do this we make,
even unconsciously, certain compensations: two per-
sons may have an intelligence of the same value, but
of somewhat different kinds. In tests there must be
introduced a kind of systematic compensation like 
this. We must test the different phases of the activ- 
ity of intelligence and seek to construct a general 
picture of the degree of intelligence from the differ- 
ent results, partially accordant, partially variant as 
they will be. 

This has given us a clear idea of what is wanted 
in the methodics of intelligence testing. 

Negatively, it must be declared that the method 
of isolated tests, the idea of basing everything on a 
single test, is methodologically no better than such a 
procedure as judging the total character of a man 
on the strength of the single arbitrarily selected 
symptom of his handwriting (graphology).

Positively, three things are evident: first, series
of tests must be arranged that will set in play the
various constituent functions of intelligence; sec- 
ondly, for this purpose there must be a wise selec- 
tion of tests ; out of the immense number of possible 
tests only those should be chosen that afford a de- 
cided and a reliable symptomatic value, general ap- 
plicability, and possibility of objective evaluation; 
thirdly, there must be created a system by means of 
which the several particular results of the testing 
can be united into one resultant value, i. e., a value 
that shows the grade of intelligence of the subject 
objectively in an inclusive formula in which per-
formances of different degrees of value shall in some
way be compensated. 

3. Series of Tests 

The first of these positive requirements has al- 
ready been met for a long time since; in especial, 
since Rieger numerous test series have been used by 
the psychiatrists for testing intelligence. These 
series have been based as a rule upon a psycholog- 
ical schema, though this schema has varied a good 
deal from one investigator to another. For illus- 
tration two such series may be mentioned, both of 
them quite recent. 

Sommer (26), in an article just published on the methods of 
intelligence testing, discusses in order the materials for testing 
the following aspects of the problem : relation of memory, of 
school information, of arithmetical ability and of association to 
understanding, also attention, capacity to apprehend, completeness 
of complexes, analysis of complexes, redintegration of complexes, 
mechanical knowledge (cleverness), constructive knowledge, logi- 
cal subordination and superordination, notion of cause and effect, 
intellectual interests, understanding of the environment. 

Ziehen (30), in the last (3d) edition of his Prinzipien und Meth-
oden der Intelligenzprüfung, makes the following classification:
retention, development and differentiation of ideas (generalization, 
isolation and complexion of ideas), reproduction and combination, 
and describes the numerous forms of questions and tests used in 
his clinic for each of these divisions. 

Although one cannot deny that these and other 
series devised by psychiatrists are quite compre- 
hensive, yet they are open to criticism in other re- 
spects : the requirements that were laid down above, 
under b and c (the second and third), are met by them only partially or not
at all. For all these series give the impression that
the selection of tests may have been more a matter
of chance or arbitrary choice than something deter-
mined by actual gauging of their value. As a rule
the selection was based upon a priori reasoning that
a certain capacity, e. g., retention or combination, 
which had been assumed to belong to intelligence 
was 'hit' by a certain kind of test. Very seldom was 
any actual preliminary investigation made to see 
whether this particular test was really superior to 
so and so many others by virtue of the precision, 
constancy and significance of the particular values 
that it afforded. Moreover, this chance selection evi- 
dently explains why there is so little agreement be- 
tween the test series of different investigators: 
every psychiatrical clinic has its own special method
of testing intelligence; every specialist in nervous 
diseases, every physician in charge of classes for 
subnormals chooses his tests to suit his fancy, and 
thus it has been impossible, so far, to effect any real 
comparison, corroboration and standardization of 
the results of different investigators. 

Finally, the usual psychiatrical test-series suffer 
from the lack of any principle by which to summar- 
ize the results in a single value. The psychiatrists 
recognize that it is possible to set a value on the
intelligence of a person as a whole, for they apply 
such predicates as "poor in judgment," "mentally 
feeble," "imbecile," "idiotic;" but if we watch the 
way in which, in the individual case, they arrive at 
the general conclusion that they draw from the data 
of their test-series, we note a yawning gap. The 
mosaic of test results is, and remains, only raw ma- 
terial ; no fundamental methodological principle, but 
only intuition, routine and subjective estimation of 
their results, dictates the final decision concerning 
the intelligence of the subject. In a certain sense, 
there is an advantage in deciding in this way, for the
gift (well-nigh an artistic gift) of intuitive appre-
ciation and sympathetic understanding is peculiarly
indispensable to the psychiatrist. But if we leave 
this possibility quite out of account, there remains a 
decided disadvantage, because every conclusion ar- 
rived at by this method then remains a subjective 
one that cannot be controlled or subjected to gener- 
alization. On this account we are justified in de- 
manding that, at least in addition to this intuitive 
diagnosis, there should also be a method for making 
an objective evaluation of the results. To meet this 
demand these mere collocations of tests will have to 
be replaced by a closed system of tests which will 
permit the derivation of a final general index of in- 
telligence from the results obtained from any subject 
whomsoever, and that in accordance with prescribed 
rules that can be applied in a comparable way in all 
places and on men of different grades of intelligence. 

An alienist has come forward lately with an attempt of this sort, 
i. e., an attempt to join together a series of tests systematically so 
as to furnish a 'picture' of an individuality. I refer to the so- 
called 'profile-method' of the Russian, Rossolimo (23-24a) : a 
method that really includes more than mere tests of intelligence 
and comes, therefore, but partially within our scope. 

Rossolimo has contrived ten tests for each of ten different men- 
tal functions. The results obtained from the single subject are 
set out graphically by erecting ordinates corresponding to the 
number of the tests achieved for each of the functions under test. 
The ends of these ordinates are then joined to make a curve that 
Rossolimo calls the 'individual profile.' This profile line is sup- 
posed to furnish a pictorial representation of the total nature of 
a patient. Thus, for instance, in those disorders in which the ca- 
pacity of immediate reproduction is decidedly reduced while the 
other capacities remain unaffected, the profile will show a sharp 
notch at a definite point, and so on. 
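
The construction of such a profile can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four function names, the pass counts, and the rule for what counts as a "notch" (an ordinate at least four tests below the mean of the others) are hypothetical and are not taken from Rossolimo.

```python
# Hypothetical illustration of a Rossolimo-style profile.
# Function names, pass counts and the notch rule are invented for the example.

def build_profile(results):
    """Map each mental function to the number of its tests passed (the ordinate)."""
    return {name: sum(1 for passed in outcomes if passed)
            for name, outcomes in results.items()}


def find_notches(profile, drop=4):
    """Return functions whose ordinate lies at least `drop` below the mean of the others."""
    notches = []
    for name, height in profile.items():
        others = [h for other, h in profile.items() if other != name]
        if others and sum(others) / len(others) - height >= drop:
            notches.append(name)
    return notches


# Ten pass/fail outcomes per function (True = test achieved).
results = {
    "attention": [True] * 8 + [False] * 2,
    "immediate reproduction": [True] * 3 + [False] * 7,
    "comprehension": [True] * 9 + [False] * 1,
    "combination": [True] * 8 + [False] * 2,
}

profile = build_profile(results)
notches = find_notches(profile)

# Crude text rendering of the profile line: one bar per function.
for name, height in profile.items():
    print(f"{name:<24}{'#' * height} {height}")
print("notch at:", notches)
```

With these invented figures the only notch falls at immediate reproduction, which corresponds to the case described above.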

The tests proposed by Rossolimo have many commendable feat- 
ures ; we may note, for example, the little puzzles, like the sepa- 
rating of two interlaced wire nooses, etc., that are used to test 
technical ability. Yet, on the whole, the principle of the con-
struction of the profile is too superficial and the coordination of
certain tests to certain mental functions, e. g., to volitional acts, 
is not precise enough to allow us to hope for much success. 

This demand for a system of tests presents such 
an exceedingly difficult scientific problem that it is 
perfectly evident that alienists and educators can 
not solve it as a side issue of their professional work, 
but that psychology itself will have to undertake the 
task. It is interesting in this connection to note how 
psychology attacked the problem along two very dif- 
ferent lines. I feel that it is important to consider 
them separately in what follows. Neither of these 
two lines of effort should be regarded as the only 
correct one ; each method has its advantages and its 
disadvantages, and, what is particularly important, 
each has its special aim for which it is fitted. The 
method of age-gradation of Binet and Simon permits 
of a rough gradation of intelligence for the whole 
range of development of the child ; it is for use in a 
comparable manner with children of different ages, 
of different nationality and cultural level, with 
normal and with feeble-minded children of all 
grades. The method of rank correlation, on the 
other hand, is limited thus far to a comparison of the 
members of a small homogeneous group, but renders 
it possible to test the gradation of intelligence with- 
in this group with a precision that the Binet method 
can not approximate. A considerable amount of ma- 
terial is already available for the first of these 
methods, and we shall have to deal with it at some 
length for that reason. With the second method, on 
the contrary, our discussion will center upon the out- 
look for its future development. 
Both methods have been tried out so far almost
exclusively upon school children; but it is to be ex-
pected that they will find use also in testing the in-
telligence of adults, both normal and mentally de-
ficient.
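
To make concrete what a comparison by rank correlation within a small group involves, a minimal sketch follows. The text gives no formula at this point; the sketch assumes Spearman's familiar coefficient for untied ranks, rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)), where d is the difference between the two ranks of the same person. The scores and the teacher's estimates are hypothetical.

```python
# Sketch of a rank-order comparison within a small group.
# Assumes Spearman's coefficient for untied ranks; all data are hypothetical.

def ranks(scores):
    """Rank 1 goes to the highest score; assumes no two scores are equal."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_correlation(xs, ys):
    """Spearman's rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for two untied series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least two members")
    rank_x, rank_y = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


test_scores = [42, 35, 50, 28, 39]       # hypothetical results of one test
teacher_estimates = [4, 3, 5, 1, 2]      # hypothetical ratings, higher = better

rho = rank_correlation(test_scores, teacher_estimates)
print(round(rho, 3))
```

A coefficient near 1 means the two orderings of the group nearly coincide; a coefficient near 0 means they are unrelated.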



II. The Method of Age-Gradation (Binet-Simon Method)¹

1. The Principle of the Method and the Tests Em- 
ployed in It 

In the nineties Binet and Simon conceived the 
idea of constructing a "graded scale of intelligence" 
(Échelle métrique de l'intelligence) that should be
especially planned for testing the intelligence of 
children. The requirements to be satisfied by the 
method were the following. A series of tests should 
be found for each year of childhood the passing of 
which could be considered normal and typical for 
children of just precisely this age. The tests must 
be relatively uninfluenced by external and chance 
conditions, especially by school learning, so that the 
result might bring out as purely as possible the real 
mental endowment of the child ; they must admit of 
as uniform use as possible in different nations, lan- 
guages or grades of culture : they should be easy to 
carry out, not necessitate laboratory apparatus or 
instruments of precision, should not exact too much
time of the child, should not impose hardship on
him or tire him, and yet must possess sufficient ac-
curacy to make possible comparison and checking of
the investigations undertaken by different persons;
and, finally, they should make it possible to work out
a final value for each subject tested that could be
deemed a measure of his general intelligence.

¹Comprehensive descriptions of his method are given by Binet
(partly in conjunction with Simon) in references 33 to 37. A
general review of the development of the method is given by
Bobertag (39). [Translator's note: also recently by Meumann, Arch. f. d. ges.
Psych., 25: 1912 (Literatur, 85 ff.).]

It seems, at first blush, as if the fulfilling of so 
many different demands would raise insurmountable 
difficulties. Above all, there was no preliminary in- 
formation available as to what intellectual perform- 
ance might be expected, even approximately, from a 
child of a given age. If some time you ask a teacher 
or some one who has been dealing with children of 
different ages for a long time at what age a child 
could be expected to give correctly the difference be- 
tween two designated objects, e. g., wood and glass, 
and at what age he would be able to explain the dif- 
ference between two abstract concepts, e. g., lies and 
mistakes, he would either be silent or make a blind 
guess at it. Here, then, was virgin land to explore. 
When to that is added the conditions that have just 
been stated, many of which are hard to reconcile 
with one another — freedom from school training, 
general ease of application, brevity, precision, possi- 
bility of quantitative evaluation, there can be no 
doubt that there was laid down here one of the hard- 
est problems that applied psychology had set for 
itself up to this time. 

Nevertheless, the difficulty has, in principle, been 
overcome. Of course this does not mean that the 
present form of the method can be regarded as a 
final form: it will doubtless suffer so many modifica-
tions in the near future that it will hardly be recog-
nized in the end. But we know that we are on the
right track, and in some future decades it can be fully 
appreciated what praise Binet and his co-worker 
Simon have deserved by directing us along this path. 

A short time ago — October 18, 1911 — the gifted 
and highly esteemed creator of the method died. His 
all-too-early demise, that we mourn most bitterly, 
compels others now to pick up the threads that he 
had spun. At such a moment it is appropriate to 
summarize briefly what has been gained and to point 
out the steps that are to be taken for further ad- 
vance. 

After many years of preliminary empirical inves- 
tigation to determine what tests might be considered 
normal for given ages, Binet and Simon published 
(33) in the year 1908, the first complete account of 
their system of tests. It comprised a series of from
five to seven tests for each age from three to thirteen 
years. A revised draft appeared in 1911 (35, 36) in 
which many tests are modified, many shifted to dif- 
ferent age-years and the number of tests for each 
age-grade brought uniformly to five. The 1911 sys- 
tem replaces tests for 11, 12 and 13-year-olds by 
tests for 13 and 15-year-olds and adults. 

A list of all the investigations conducted on the 
B. S. tests to date is given in the bibliography at the 
end. In the appendix there are brought together in 
comparative form the series of tests proposed for 
each age by Binet and Simon in 1908, and 1911, by 
Bobertag, and by Terman and Childs. 

As a glance at the list of tests shows, almost all of 
them are of the alternative type, i. e., they are tests 
in which performance can not be graduated, but can
only be scored right or wrong (+ or −). Failure
to reply at all is counted 'minus' just as much as an 
expressly given wrong answer. It must be admitted 
also that it is often quite hard to decide in a given 
case whether to rate an answer + or −: the only
way to do this with certainty is to practise for a 
long time and to observe uniformly the criteria that 
have been chosen for the decision. 
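
The scoring rule just stated can be set down as a short sketch. It is illustrative only: the function names are hypothetical, and the criterion used (a definition of "horse" must name the genus, as required at the higher age-level) stands in for whatever uniform criterion the examiner has adopted.

```python
# Sketch of scoring an alternative test: every reply is '+' or '-',
# and failure to reply counts as '-' just like a wrong answer.
# Names and the sample criterion are hypothetical.

def score_alternative(answer, is_acceptable):
    """Return '+' only for an acceptable answer; a missing or blank reply is '-'."""
    if answer is None or not answer.strip():
        return "-"
    return "+" if is_acceptable(answer) else "-"


def names_genus(answer):
    """Sample criterion for 'What is a horse?': the reply must name the genus."""
    return "animal" in answer.lower()


replies = ["An animal.", "To ride.", None, "   "]
marks = [score_alternative(reply, names_genus) for reply in replies]
print(marks)
```

Keeping the criterion as a separate function mirrors the demand that it be fixed in advance and applied uniformly, rather than decided afresh for each child.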
The tests are extremely varied in nature. 

Memory is tested, on the one hand as immediate memory for 
digits and sentences of different lengths, for a story that is read, 
and for three simple orders given together, and on the other hand 
as possession of simple everyday knowledge (days of the week, 
months, coins, right and left). Size and availability of vocabulary
is determined by the number of words that can be named in three 
minutes. 

Since 1911 a test of suggestibility (judgment of line-lengths) 
has been introduced. 

Motor abilities are tested by some tests of drawing from copy,
paper cutting and writing. Practical accomplishments are in- 
volved in counting coins, making change for a larger coin, exe- 
cuting the three commissions just mentioned. 

Most of the tests, however, aim more directly at intellectual
activities. Comparison and discrimination are dealt with in va- 
rious forms, e. g., sensory comparison (of small boxes of like ap- 
pearance, but unlike weight), logical discrimination from memory, 
both between concrete terms (wood and glass, fly and butterfly) 
and between abstract terms (lies and mistakes) ; esthetic com- 
parison (drawings of beautiful and ugly faces). There are also 
tested defining of both concrete and abstract terms, the completing 
of omissions in a text, the combining of three words into a sen- 
tence, orderly arranging both of sensory material (putting five 
little boxes in order according to their weight), and of logical, 
verbal material (placing jumbled-up words in a sentence) ; the 
intelligent apprehension of a picture ; critical apprehension, both 
optical (noting omissions in drawings of persons), and logical 
(recognizing inconsistencies in certain sentences) ; practical moral 
intelligence (by questions in the form : 'What's the thing to do 
when so-and-so happens?'). 

Many of the tests recur in different age-levels in 
such a way that the standard of performance de-
manded is varied. Thus, the pictures are presented
to subjects of all ages ; enumeration of the pictured 
objects corresponds to the 3-year old level, a descrip- 
tion of the action that the persons are carrying on, 
to the 7-year old level, a comprehension of the total 
meaning of the picture to the 12-year old level. The 
defining of concrete terms appears in the 6 and the 9- 
year stages; in the former, definition in terms of 
use suffices, e. g., "What is a horse?" "To ride;" 
in the latter something superior to this is demanded, 
e. g., "What is a horse?" "An animal." Finally, 
the memory span tests for digits and sentences are 
graded into several classes according to their length; 
thus, after once hearing the digits, the 3-year old
child should be able to repeat two, the 4-year three, 
the 7-year five, the 12-year seven digits. 
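
The grading of one recurring test across age-levels can be illustrated with the digit-span figures just given. The sketch below is a hypothetical helper, not part of the Binet-Simon procedure: it reports only the highest age-level whose demand a given span satisfies on this one test, and does not compute a final score for the child.

```python
# Digit-span demands as stated in the text: age-level (years) -> digits
# to be repeated after one hearing. The helper function is hypothetical.
DIGIT_SPAN_NORMS = {3: 2, 4: 3, 7: 5, 12: 7}


def highest_level_passed(span):
    """Return the highest age-level whose digit-span demand is met, or None."""
    passed = [age for age, digits in DIGIT_SPAN_NORMS.items() if span >= digits]
    return max(passed) if passed else None


for span in (1, 2, 4, 5, 7):
    print(span, "digits ->", highest_level_passed(span))
```

A child who repeats five digits, for example, meets the 7-year demand on this test but not the 12-year demand.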

The individual tests are of unequal value. Many 
are of exceptional merit, e. g., defining, describing 
pictures, answering questions that put a premium 
on intelligence. It is also a very meritorious feature 
that there are tests among them whose solution does 
not depend on readiness in the use of speech, e. g., 
the arrangement of the five weights, esthetic com- 
parison, recognizing omissions in pictures: we are 
as a rule altogether too much inclined to identify 
control of verbal expression with intelligence, an 
inference that is often false. Others of the tests, 
however, are more dependent than we could wish on 
external, particularly on home influences, e. g., know- 
ing coins, or are too much mere functions of pure 
mechanical memory (reciting the days of the week), 
so that it would be better to supplant them by others 
in the future. It must be recognized that any change 
in the selection and arrangement of these tests pre-
sents a difficulty of quite another sort than as if they
were mere collocations of tests: for, since each of 
these tests is a factor in the determination of the 
final score, it is possible that a change may destroy 
the equilibrium of the whole system. This is easy 
to be seen in the supplementary investigation of
Binet and Simon themselves, when they tried to cor- 
rect their system by the omission, insertion and 
transference of particular tests: for trials, e. g., 
those of Terman and Childs and of Chotzen, have 
shown that the second edition (1911) is in many re- 
spects less useful than the earlier form (1908). 

What remedies can be devised for this situation 
will be discussed below (Section 5a). 

The technique of the Binet-Simon method is by no 
means so easy as the simplicity of the material used
would lead one at first to suppose. It is to be recom- 
mended that, so far as is in any way feasible, the ex- 
aminer should always do his work with the aid of an 
assistant to keep the record, so as to avoid the un- 
desirable division of attention between testing and 
recording. Both of these experimenters must have 
gained a high degree of practise and be well used to 
one another before they proceed to actual testing. 
The examiner must have an almost mechanical exact- 
ness and uniformity in the formulation of the con- 
tinually recurring questions, in the modulation of 
his voice, etc., yet he must be prepared for the many 
individual variations that appear in consequence of 
different reactions of the subjects, and must have 
definite measures in readiness for use in these junc- 
tures. Never must he permit it to be seen that some 
answers are more, others less satisfactory to him: 
rather must he maintain an attitude of uniform and 
quiet friendliness. The recorder should not confine 
himself to the mere noting of plus and minus signs 
to show the net outcome of each test, but should also 
note down as fully as possible what the subject says 
and also such features of his behavior and attitude 
toward the tests as are worth noting. This is neces- 
sary both because it is often impossible to decide 
whether to credit a test 'plus' or 'minus' until later 
on, after quiet consideration (and the material must 
be available for that) and also because it should 
make possible a qualitative analysis of the examinee. 
The individual subject ought, of course, to be 
tested not only with the tests of his age, but also 
with a considerable part of the whole series — on ac- 
count of the area of scattered distribution to be dis- 
cussed in a moment. The examiner should begin 
with tests that are neither too easy nor too difficult, 
avoid monotony and introduce short pauses if fatigue 
becomes noticeable. The testing of a single indi- 
vidual takes, for normals 20 to 30 minutes, accord- 
ing to age and circumstances, for abnormals from 
one-half to three-quarters of an hour, on account of 
the slower response. 

In mass experiments there is a source of difficulty in the possi- 
bility of communication between those already tested and those 
to be tested. It is true that the danger of such a 'psychic infec- 
tion' is not very great, on account of the peculiar character of the 
material used for the testing; nevertheless, one should avoid, as 
far as may be, the possibility of any spreading of information. 
Thus, for instance, it is not advisable to test the pupils of one 
class on several days in succession. If it is desired to examine a 
rather large number of children that belong in the same group, the 
plan followed at Breslau seems useful: four experimenters (with 
their clerks), all of whom had been trained to conduct the tests 
in the same way, carried on tests the same afternoon in different 
rooms. Each experimenter could deal with four or five subjects 
in this time, and each subject was obliged to go home directly 
after his examination; in this way, 16 to 20 members of the class 
were tested without there being possible any exchange of ideas 
between them. 

For all further details of the technique of these 
tests the directions for using them that are already 
available for different nations must be consulted. 

Such directions have been given for the examina- 
tions of French children by Binet and Simon in 1911 
(35, 36), for English and American children by 
Whipple in his Manual (28), by Wallin (67) and more 
briefly by Huey (9), and for Italians by Treves-Saf- 
fiotti (66). For use in Germany Lipmann first fol- 
lowed the original instructions as precisely as possi- 
ble, and then Bobertag (40) described very fully his 
elaboration of them as based on practical tests — 
an elaboration that differs from Binet and Simon to 
advantage in some particulars, e. g., in the choice of 
pictures. The extended directions for testing and 
questioning that Bobertag has prepared should form 
the basis of all future investigations in Germany.* 

*The simple set of materials needed for carrying on the German 
tests, after Bobertag (lists of questions, tests of memory span, 
pictures, set of small boxes for weights, etc.), may be had of the 
Institute for Applied Psychology at Klein-Glienicke. 

2. The Resultant Values: Mental Age, Mental Retardation, 
Advance, and Arrest; Mental Quotient 

We must now pass on to note how the grade of in- 
telligence of a subject can be derived from his per- 
formances in the tests. 

Considering the problem schematically, we might 
think that the grade of intelligence could be ex- 
pressed by the stage whose tests could just be passed 
by the child: a subject who readily passed all the 
tests up through the 9-year ones, but failed with the 
10-year and subsequent ones, would, accordingly, 
possess a nine-year grade of intelligence. 

But things are never quite so simple in actuality 
as they are in theory. The varying tests of any 
given age-level (we may call them a, b, c, d, e) are 
not all equally difficult for all children, but there are, 
on the contrary, quite remarkable individual varia- 
tions. One child passes a to d, but fails with e; an- 
other passes a, c and e, but not b and d. This is due 
in part to momentary fluctuations of attention, 
fatigue, etc., that must, of course, always be reckoned 
with, but in part also to qualitative differences in in- 
telligence. The correlation between the different 
phases of intellectual functions is truly never so high 
that a positive accomplishing of test a must neces- 
sarily entail a like accomplishing of the approxi- 
mately 'equally difficult' tests b, c and d. 

And so it comes about that there is no hard and 
fast boundary between the age-level that a child 
passes completely and the levels that are unquestion- 
ably beyond his powers; rather is there an interme- 
diate territory of greater or less extent within which 
successes and failures are scattered in irregular 
fashion: we shall call this the area of irregularity 
(Gebiet der Staffelstreuung). It is impossible to de- 
rive a mean or average value from the data afforded 
by this area without proceeding in a somewhat arbi- 
trary manner, but the formula proposed by Binet 
and Simon seems to have answered very well so far. 
According to it, one first ascertains up to what 
age-level the tests are passed without failure (save 
that possible failure with a single test is not counted, 
because such failure may have been due to a momen- 
tary lapse of attention). This age-level is taken as 
the basis, but every five tests passed in levels above 
it are counted as one more year. If, then, a child 
should pass all tests (save a single one) to and in- 
cluding the six-year level and in addition three tests 
each in the 7th, the 8th, and the 9th year and one test 
also in the 10th year, these ten additional tests would 
be counted as two years, and the child would obtain 
for the net value of his intelligence, 6 + 2 years, i. e., 
his intelligence would be rated as that of an 8-year 
old child. 
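
The counting rule just described can be put as a short sketch. It is an illustration only: the name `mental_age`, the dictionary layout and the five tests per level are assumptions, as are two readings of the text, namely that the one excusable failure is counted once over all levels up to the basal level, and that an incomplete group of five extra tests earns no credit.

```python
def mental_age(results, allowed_failures=1):
    """Estimate mental age by the counting rule described in the text.

    results maps each tested age-level to a list of booleans, one per
    test (True = passed). Levels below the lowest one listed are taken
    as passed (an assumption of this sketch).
    """
    levels = sorted(results)
    basal = None
    failures = 0
    for level in levels:
        # Assumption: the single excusable failure is counted cumulatively.
        failures += results[level].count(False)
        if failures > allowed_failures:
            break
        basal = level
    if basal is None:
        raise ValueError("lowest tested level already exceeds the allowed failures")
    # Every five tests passed above the basal level count as one more year.
    # Assumption: a remainder of fewer than five tests adds nothing.
    extra_passed = sum(results[lvl].count(True) for lvl in levels if lvl > basal)
    return basal + extra_passed // 5


# The example from the text: all tests but one through the six-year level,
# three tests each at 7, 8 and 9 years, and one test at 10 years.
child = {
    6: [True, True, True, True, False],
    7: [True, True, True, False, False],
    8: [True, False, True, True, False],
    9: [True, True, False, False, True],
    10: [False, False, True, False, False],
}
print(mental_age(child))  # 8
```

The ten tests passed above the six-year level give two further years, so the sketch returns 8, in agreement with the worked example.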

This net value in terms of which the total intelli- 
gence of the subject is graded has, therefore, the sig- 
nificance of an age-designation: it indicates that the 
intelligence of the child tested is equivalent to the 
average intelligence of the children of the age stated. 
We thus arrive at the concept of mental age (Intel- 
ligenzalter, niveau intellectuel), which is the cardinal 
feature of the method of graded tests. 

Now mental age must not, of course, be thought 
of as an absolutely unequivocal determination of a 
subject's intelligence, but only as a very rough quan- 
titative characterization of its value, without any 
implication as to qualitative differences, because one 
and the same mental age can be figured from the 
most varied sorts of distribution of passed and failed 
tests. But this very thing appears to constitute an 
advantage, rather than a disadvantage of the con- 
cept of mental age, for it gives expression to a fun- 
damental psychological fact (already mentioned 
above) that, on account of the purely formal char- 
acter of intelligence and the lack of complete cor- 
relation among its constituent capacities, there never 
is a real phenomenological equivalence between the 
intelligence of two persons: what we do have is 
rather a teleological equivalence, when measured in 
terms of the single function of all intelligence, 
namely, adaptation to new requirements. And for 
this equivalence of two intelligences mental age fur- 
nishes an approximate measure, despite the fact that 
their equivalence does not mean their identity. 

The area of irregularity yet further affects the computation of 
mental age and in a way to which sufficient attention has not al- 
ways been given. In order to equalize possible omissions in the 
lower test-levels, one must always have at one's disposal tests in 
higher levels. Now the original Binet-Simon series comprised 
tests up to 13 years only: it follows that mental age 12 or 13 can- 
not be correctly computed, for tests from yet higher levels might 
perhaps have raised the total performance to a higher value. In 
using the 1908 Binet series, accordingly, computations ought to be 
carried up to mental age eleven only. 

The area of irregularity, again, affords another 
value in addition to mental age, viz.: the range of 
irregularity (Streuungsbreite). A child whose suc- 
cesses and failures are strewn irregularly over test- 
levels from 6 to 10 years has the same mental age, to 
be sure, but a very different range of irregularity, 
when compared with another whose mixture of suc- 
cesses and failures lies in the 7th to the 9th years 
only. Bobertag, who first gave attention to the im- 
portance of differences in ranges of irregularity, 
has devised a way of computing this factor; I have 
myself suggested another way, but neither has been 
published as yet. 
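
Since neither of the two computations is given in the text, the sketch below reproduces neither of them. It merely illustrates the idea with the simplest measure one might assume: the span from the lowest age-level containing a failure to the highest age-level containing a success. The name `range_of_irregularity` and the two sample records are hypothetical.

```python
def range_of_irregularity(results):
    """Return (lowest level with a failure, highest level with a success).

    results maps age-level -> list of booleans (True = passed).
    This span is an assumed stand-in, not the unpublished method of
    Bobertag nor that of the author. Returns None when successes and
    failures do not overlap at all.
    """
    with_failure = [lvl for lvl, tests in results.items() if not all(tests)]
    with_success = [lvl for lvl, tests in results.items() if any(tests)]
    if not with_failure or not with_success:
        return None
    low, high = min(with_failure), max(with_success)
    if high < low:
        return None  # clean break between passed and failed levels
    return (low, high)


# Hypothetical records: one child scattered over levels 6 to 10,
# another over levels 7 to 9 only.
wide = {
    5: [True] * 5,
    6: [True, True, True, True, False],
    7: [True, True, False, True, False],
    8: [True, False, True, False, False],
    9: [False, True, False, True, False],
    10: [False, False, True, False, False],
    11: [False] * 5,
}
narrow = {
    6: [True] * 5,
    7: [True, True, True, True, False],
    8: [True, True, False, True, False],
    9: [True, False, False, False, False],
    10: [False] * 5,
}
print(range_of_irregularity(wide), range_of_irregularity(narrow))
```

Any such measure describes only the width of the scattered area; which tests failed within it still has to be read from the record itself.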
Yet, even with these methods, qualitative differ- 
ences in the area of irregularity are not touched, 
and for this reason it will be necessary in many cases 
to enter into a detailed analysis of the testing as well 
as to state the two resultant values (mental age and 
range of irregularity). It will often be distinctly 
worth while to determine in which tests there was 
special difficulty, in which special success. More- 
over, the value of observing the child during the test- 
ing must not be underestimated, for in many of the 
tests there are ways of setting about the task that 
may be of great interest (and for medical or peda- 
gogical judgment of the case, too), though these 
things would not be evident in the mere plus or minus 
set down for the outcome of the tests. We may al- 
lude, in this connection, among other things, to the 
kind of description given to the pictures, to the enu- 
meration of the 60 words, as well as to the behavior 
of the child when he arranges in order the five 
weights of like appearance but unlike weight. In 
this last it is not nearly so important whether the 
child finally gets the order right as it is to observe 
the child's manner of going to work: whether and 
how quickly he grasps the unaccustomed problem, 
whether he compares just two weights each time, or 
compares each weight with all the others when he 
puts it in place, or what not. In these investigations 
we should be warned, then, against the bare pursuit 
of numerical values : computation of such values and 
qualitative analysis must supplement one another, 
though, naturally, now the former and now the latter 
will receive special stress, according to the setting 
of the problem. 

But let us return to mental age. The full signifi- 
cance of this final value is disclosed only when we 
consider it in relation to other circumstances. It can 
evidently be related to other quantitative scales, like 
chronological age, school grade and school standing, 
or we can find out how it varies with certain qualita- 
tive conditions, like social level, type of school, na- 
tionality and the like. 

Doubtless most significant is the relation of mental 
age to the actual chronological age of the subject, for, 
as already said, a certain mental level goes normally 
with a certain age, so that the relation of mental to 
chronological age indicates the amount of discrep- 
ancy between the intelligence present and that re- 
quired (in the sense of a norm to be expected), and 
in this way affords an expression for the degree of 
the child's intellectual endowment. 

Up to now this discrepancy has always been com- 
puted in the simple form of the difference between 
the two ages, which, when negative gave the absolute 
mental retardation, when positive the absolute mental 
advance of the child in terms of years. Thus, if 
mental retardation = -2, the child's mental de- 
velopment is two years behind the normal level of 
his age. 
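
As a minimal sketch of this difference, assuming ages are given in whole years (the names `absolute_difference` and `describe` are hypothetical, and the wording of the labels simply follows the terms used in the text):

```python
def absolute_difference(mental_age, chronological_age):
    """Mental age minus chronological age, in years."""
    return mental_age - chronological_age


def describe(difference):
    """Name the difference in the terms used in the text."""
    if difference < 0:
        return f"absolute mental retardation of {-difference} year(s)"
    if difference > 0:
        return f"absolute mental advance of {difference} year(s)"
    return "at age"


# A child of 8 with a mental age of 6 is two years behind.
diff = absolute_difference(6, 8)
print(diff, describe(diff))
```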

It is perfectly clear how valuable the measurement 
of mental retardation is, particularly in the investi- 
gation of abnormal children. It has, however, been 
shown recently that the simple computation of the 
absolute difference between the two ages is not en- 
tirely adequate for this purpose, because this differ- 
ence does not mean the same thing at different ages 
(compare what is said in Section 4a, pp. 70 ff.). Only 
when children of approximately equal age-levels are 
under investigation can this value suffice: for all 
other cases the introduction of the mental quotient 
will be recommended farther on (cf. pp. 80 ff.). This 
value expresses not the difference, but the ratio of 
mental to chronological age and is thus partially in- 
dependent of the absolute magnitude of chronological 
age. The formula is, then: mental quotient = mental 
age ÷ chronological age. With children who are just 
at their normal level, the value is 1, with those who 
are advanced, the value is greater than unity, with 
those mentally retarded, a proper fraction. The 
more pronounced the feeble-mindedness, the smaller 
the value of the fraction. 
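
The formula can be sketched in a few lines. The two children compared are hypothetical; they are chosen only to show that the same absolute difference of two years yields different quotients at different ages.

```python
def mental_quotient(mental_age, chronological_age):
    """Ratio of mental age to chronological age, as defined in the text."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age


# Two hypothetical children, each two years behind the level of his age.
younger = mental_quotient(4, 6)    # six-year-old with mental age 4
older = mental_quotient(10, 12)    # twelve-year-old with mental age 10
print(round(younger, 2), round(older, 2))
```

The younger child's quotient is the smaller of the two, although both show the same difference in years.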

Another and last concept that 'mental age' sup- 
plies is that of mental arrest. This applies only to 
feeble-minded individuals and means a mental age 
that is not exceeded, despite increase of chronolog- 
ical age. 

3. Results with Normal Children 

The investigation of normal children forms a pre- 
condition of the whole method, since the norm for 
each age must first be determined upon them. Yet, 
at the same time, investigations of these children 
have already brought out a series of results that 
permit us to set no slight value on the future worth 
of intelligence tests for the problems of normal peda- 
gogy. Thus far, tests have been made chiefly upon 
children in the common schools of both sexes and 
of different ages, less often upon pupils of the higher 
schools. 

(a) General distribution of the level of intelli- 
gence. In those investigations where there have 
been tested a large number of elementary school 
pupils of different ages and with no attempt at spe- 
cial selection, there could be worked out general sta- 
tistics of the number of children that are at, above 
or below the mental level of their age. I bring to- 
gether in the following table the percentages thus 
far obtained. 



TABLE I. 
DISTRIBUTION OF THE LEVEL OF INTELLIGENCE FOR ALL AGES 
COLLECTIVELY. 

Difference, in Years, Between 
Mental and Chronological Age     -2      -1       0      +1      +2 

Binet (203 Children)              6      21.5    51      20.5     1 
Bobertag (261 Children)*          3      19      52      22.5     2.5 
Goddard (1277 Children)†         11‡     20.5    41.5    21.5     5.5‡ 

*Children from 5 to 10 years old. 
†Children from 5 to 11 years old. 
‡Includes two or more years below or above age. 

Binet (37, p. 112) has brought together a frequency distribution 
of 203 normal children (ages not given). In this distribution we 
may note a remarkable symmetry: almost exactly one-half of the 
children are 'at age,' a good quarter are 'below age,' and a scant 
quarter are 'above age.' 

Bobertag has called attention to this peculiarly simple sym- 
metrical numerical distribution that he had noted first in his own 
results and then found confirmed in Binet. 

Bobertag has just published his own frequency distribution. I 
take from it (40, II, Table I) the figures for 261 children between 
5 and 10 years. While here, again, the 'at age' children comprise 
half of all the cases, the divergence between the two other groups 
is but slight — the 'advanced' are somewhat more numerous than 
the 'retarded' children. 

A third set of data, derived from a much more extensive ma- 
terial, has been given us by Goddard (48), who has tested all the 
school children of a small American city (Vineland, N. J.). The 
distribution curve that Goddard has prepared from his raw figures 
is certainly not usable, because he has included in it also the age- 
levels of 12 years and over, i. e., children for whom no more ade- 
quate tests were available from higher levels; the data for these 
subjects must therefore necessarily be thrown out. If we bring 
together only those children whose area of irregularity is of satis- 
factory scope, children, then, between 4 and 11 years old, we shall 
have 1277 children, and it is their percentual distribution that I 
have computed. In it the percentage of children 'at age' is some- 
what less, that of children 'above age' is approximately the same 
as Bobertag, while that of children 'below age' shows a plain, 
though not very large increase. 

When we stop to think that we have to do in these 
three investigations not only with children of differ- 
ent nationality, but also with different examiners, 
each of whom had his own way of setting the tests 
and of evaluating them, we can not lay great stress 
upon what discrepancy exists between the three sets 
of statistics: we may conclude from them that when 
a sufficiently large number of non-selected children 
of different ages are tested, their degree of intelli- 
gence will be distributed in a somewhat symmetrical 
fashion. Approximately one-half (in America a 
scant half) stand at the level of their age; about a 
fifth (to a fourth) are a year retarded and the like 
number a year advanced; only a small percentage 
(at the most 11 per cent.) show more than one year 
of retardation and a still smaller fraction (at the 
most 5.5 per cent.) is mentally advanced by more 
than one year. 
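
The summary above can be checked against the percentages tabulated for the three investigations. The sketch below only re-adds those figures; the names `TABLE_I` and `summarize` are hypothetical.

```python
# Percentages by difference (mental age minus chronological age, in years),
# as tabulated above for the three investigations.
TABLE_I = {
    "Binet":    {-2: 6.0,  -1: 21.5, 0: 51.0, 1: 20.5, 2: 1.0},
    "Bobertag": {-2: 3.0,  -1: 19.0, 0: 52.0, 1: 22.5, 2: 2.5},
    "Goddard":  {-2: 11.0, -1: 20.5, 0: 41.5, 1: 21.5, 2: 5.5},
}


def summarize(row):
    """Return the percentages (below age, at age, above age)."""
    below = sum(pct for diff, pct in row.items() if diff < 0)
    above = sum(pct for diff, pct in row.items() if diff > 0)
    return below, row[0], above


for name, row in TABLE_I.items():
    below, at_age, above = summarize(row)
    print(f"{name:9s} below {below:5.1f}  at age {at_age:5.1f}  above {above:5.1f}")

# Largest shares found more than one year retarded or advanced.
most_retarded = max(row[-2] for row in TABLE_I.values())
most_advanced = max(row[2] for row in TABLE_I.values())
print(most_retarded, most_advanced)
```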

One must be careful not to regard the 'at age' 
child and the 'normal' child as synonymous terms: 
on the contrary, the statistical results themselves 
show that the 'at age' children simply constitute the 
middle section of normality, while the children that 
are one year advanced or retarded are still com- 
pletely within the bounds of normality. 

It is worth noting that the distribution just cited bears a cer- 
tain resemblance to the simplest type of distribution known as 
Gauss' frequency curve, for this latter is not only a symmetrical 
curve, but it is also divided by the value known as the 'probable 
error' into three sections, and in such a manner that the middle 
section comprises the half, and the two end-sections each one- 
quarter of all the cases. Even a generation ago Galton advanced 
the hypothesis that the abilities of a large group of non-selected 
individuals would be distributed symmetrically in the form of the 
Gaussian curve. It is true that Galton thought the Gaussian law 
of distribution could be extended to apply to a very detailed 
gradation (16 grades) of ability, whereas statistics at present 
available only make it probable for a few main groups. 
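
The property of the Gaussian curve referred to here can be verified numerically. The sketch uses `NormalDist` from the Python standard library; the function name and the choice of a standard curve (mean 0, standard deviation 1) are assumptions made for illustration.

```python
from statistics import NormalDist


def shares_split_by_probable_error(mu=0.0, sigma=1.0):
    """Return (probable error, lower share, middle share, upper share).

    The probable error is the half-width of the central interval that
    contains one-half of all cases of a Gaussian distribution.
    """
    dist = NormalDist(mu, sigma)
    probable_error = dist.inv_cdf(0.75) - mu
    lower = dist.cdf(mu - probable_error)
    upper = 1.0 - dist.cdf(mu + probable_error)
    middle = 1.0 - lower - upper
    return probable_error, lower, middle, upper


pe, lower, middle, upper = shares_split_by_probable_error()
print(round(pe, 4), round(lower, 2), round(middle, 2), round(upper, 2))
```

For the standard curve the probable error is about 0.6745 of the standard deviation, and the three sections hold one-quarter, one-half and one-quarter of the cases.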

Bobertag has supplemented our knowledge of this 
matter by discovering that a similar distribution 
holds good on other occasions when a fairly large 
number of individuals is divided into good, medium 
and poor groups. In statistics of marks pertaining 
to 2772 pupils, it turned out that marks of "better 
than satisfactory" were assigned to 25.7 per cent., 
of "satisfactory" to 50.8 per cent., and of "unsatis- 
factory" to 23.5 per cent. of all cases (40, II, 
Table IV). 

Yet it is well not to ascribe too great significance 
to these ratios of distribution. In the first place, 
the empirical data now at our disposal are not 
enough to warrant as yet the assumption of a gen- 
eral conformity to law; and even for the data now 
at our disposal the formula holds only as a rough 
approximation and merely as a general tendency for 
a rather large number of cases within which the 
numerous irregularities compensate each other 
(compare on this point the next section). Neverthe- 
less, the findings already secured are of sufficient in- 
terest to be followed up farther.* 

*Further discussion of this principle of symmetrical distribution 
and its relation to the Gaussian curve may be found in one of my 
previous articles (I, 248 ff.) and in Bobertag (40, II). 

The principle governing this distribution has, how- 
ever, heuristic value even now in two ways: (1) 
When we are obliged to divide a group of persons on 
the basis of their mental ability into a good, a me- 
dium, and a poor group, the convenient and common 
division into three groups of equal size is certainly 
less close to the actual gradations than the setting 
off of a good and a poor quarter from the homoge- 
neous middle half of medium ability. (2) A require- 
ment, e. g., of a test or of a series of tests, may stand 
as normal for a given group of individuals when ap- 
proximately 75 per cent, of the members of the group 
meet it in a satisfactory or more than satisfactory 
manner. This idea has been used by Bobertag* for 
the standardization of tests. 
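
Both uses of the principle can be sketched briefly. The function names are hypothetical, the scores are invented for illustration, and "approximately 75 per cent." is read here as "at least 75 per cent.", which is an assumption of the sketch rather than a rule stated in the text.

```python
def split_quarter_half_quarter(scores):
    """Divide scores into (poor quarter, medium half, good quarter)."""
    ranked = sorted(scores)
    n = len(ranked)
    quarter = n // 4
    poor = ranked[:quarter]
    medium = ranked[quarter:n - quarter]
    good = ranked[n - quarter:]
    return poor, medium, good


def is_normal_requirement(passed, threshold=0.75):
    """True when the share of the group meeting the requirement reaches
    the threshold (assumed reading of 'approximately 75 per cent.')."""
    return sum(passed) / len(passed) >= threshold


# Twelve invented scores: three poor, six medium, three good.
poor, medium, good = split_quarter_half_quarter(range(1, 13))
print(poor, medium, good)

# A requirement met by 15 of 20 children of a group.
print(is_normal_requirement([True] * 15 + [False] * 5))
```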

*See Section 5a for details. 

(b) Different age-levels and nationalities. God- 
dard has thought that the symmetry of the curve of 
which we have just been speaking might be deemed 
proof that the Binet-Simon arrangement of tests 
represents in a way an ideal series, because it has af- 
forded on empirical test a distribution that was 
theoretically to have been expected. But this con- 
clusion is unjustifiable. The curve of symmetry ap- 
plies primarily only to all children taken collec- 
tively, without regard to age; but the Binet-Simon 
tests should really embrace normal standards for 
children of every one of the series of ages and their 
correctness would be demonstrated only provided 
the symmetrical distribution were disclosed for 
normal unselected children of each single year. But 
this is by no means the case, and, as a matter of fact, 
least of all in Goddard's own results. Rather is it 
true, as is evident from closer consideration, that 
the symmetrical curve above mentioned owes its 
existence to the fact that the varying results of dif- 
ferent years practically compensate each other.† In 
truth, the results of almost all who have tried out 
the Binet-Simon method, regardless of the nation- 
ality tested, agree that the series set for the lower 
years are too easy, those for the higher too difficult. 
The evidence for this, so far as known to me, I have 
introduced in Table II. 

From Goddard's (48, p. 243), Bobertag's (40, II, Table I) and 
Miss Johnstone's‡ raw tables I have computed the percentages of 
frequency for American, German and English children. Goddard's 
data I have figured for each year separately; those for the two 
other investigators by bringing two or three years together, on 
account of the smaller number of cases (Table II). It will be 
seen that in the lower years many more are 'advanced' than there 
should be: in Goddard and in Johnstone the advanced outnumber 
not only the retarded, but even the 'at-age' children. Thus, for 
instance, in Goddard more than half the 5-year old children attain 
a mental age of 6 years or over, clear evidence that these tests 
are much too easy. In Bobertag the lack of symmetry is not so 
pronounced. 

The area of excessive percentage of advanced children (and thus 
of excessive ease of the tests) extends in Goddard through the 7th, 
in Bobertag through the 8th year; in Johnstone it has not en- 
tirely disappeared even at the 9th year. Then comes a sudden re- 
versal: in the higher years the number of retarded children in- 
creases: the tests are therefore too hard. 

With the 155 subjects examined by Bloch and Preiss (38) there 
was made at the outset a selection such that only children of 
medium ability and school performance were tested. Consequently, 
retardation in mental age appeared almost not at all, but advance 
did appear, though in diminishing frequency with advancing age. 
From the data given it can be computed that there were above the 
level of their age a full 50 per cent. of the 7-year olds, 20 per cent. 
of the 8 and 9-year olds and only 14 per cent. of the 10 and 11-year 
olds. 

†Ayres (31) calls attention to this point in his critique of 
Goddard. 

‡In Miss Johnstone's original work (52) the quantitative data 
are not given sufficiently clearly, but these data are given by 
Binet (36, p. 196) where will be found a table of distribution for 
146 Sheffield school girls as imparted in a letter from Miss John- 
stone. 

TABLE II. 
PERCENTAGES RETARDED, AT AGE AND ADVANCED AT DIFFERENT YEARS. 

                Chronological 
Investigator         Age         Retarded    At Age    Advanced 

Goddard               5            12          35        53 
                      6            20.5        30        49.5 
                      7            13          58        29 
                      8            44          41        15 
                      9            40          28        32 
                     10            27.5        56        16.5 
                     11            56          36         8 

Bobertag             5-6           11          60        29 
                     7-8            7          48.5      44.5 
                     9-11          34          50        16 

Johnstone            6-7           12          20        68 
                     8-9           20          40        40 
                    10-11          62          25        13 



In the work of the Americans, Terman and Childs (64), and of 
Mlle. Descoeudres (46), of Geneva, we find another method of 
presenting data, but the same result. The first-named tested 396 
unselected children and figured the average value of each age; 
they found that the young children attained a much too high level, 
the older children a too low average level of intelligence, so that, 
on the whole, the mental levels were more like one another than 
were the chronological levels. It follows that the tests fail to 
bring fully to light the actual differences between the children. 
Mlle. Descoeudres had tested in all only 24 children of six different 
ages; the results showed differences of only two to four years in 
the mental ages of children in the youngest and oldest groups, 
though the chronological ages differed by six years. 
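
The narrowing of the mental levels can be made explicit with the three Terman and Childs averages tabulated in this section. The sketch takes those three pairs as its only data; the names `AVERAGES` and `spread` are hypothetical.

```python
# (chronological age, mental age) averages from Terman and Childs,
# as tabulated in this section.
AVERAGES = [(4.75, 6.50), (7.50, 8.00), (12.33, 11.00)]


def spread(values):
    """Distance between the largest and the smallest value."""
    return max(values) - min(values)


chronological = [c for c, _ in AVERAGES]
mental = [m for _, m in AVERAGES]

# Mental age minus chronological age for each group.
differences = [round(m - c, 2) for c, m in AVERAGES]
print(differences)

# The mental ages lie closer together than the chronological ages.
print(round(spread(chronological), 2), round(spread(mental), 2))
```

The youngest group stands well above its age and the oldest below it, so that a chronological span of about seven and a half years shrinks to a mental span of four and a half.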

All these findings show, first of all, that the ar- 
rangement of the tests set forth by Binet and Simon 
in 1908 suffers from not inconsiderable errors that 
must be removed. Binet himself has recognized 
these defects, too, at least in part, for he subse- 
quently relegated the tests for 11, 12 and 13-year 
subjects to higher age-levels. 
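
The reversal from "too easy" to "too hard" can be read off Goddard's yearly percentages in Table II by comparing the advanced with the retarded share at each age. The sketch below does only that; the names `GODDARD` and `balance` are hypothetical, and treating the sign of the difference as the criterion is an assumption made for illustration.

```python
# Goddard's percentages from Table II: age -> (retarded, at age, advanced).
GODDARD = {
    5: (12, 35, 53),
    6: (20.5, 30, 49.5),
    7: (13, 58, 29),
    8: (44, 41, 15),
    9: (40, 28, 32),
    10: (27.5, 56, 16.5),
    11: (56, 36, 8),
}


def balance(row):
    """Advanced minus retarded percentage; positive suggests the tests
    of that year are too easy, negative that they are too hard."""
    retarded, _, advanced = row
    return advanced - retarded


too_easy = sorted(age for age, row in GODDARD.items() if balance(row) > 0)
too_hard = sorted(age for age, row in GODDARD.items() if balance(row) < 0)
print(too_easy, too_hard)
```

On this criterion the excess of advanced children runs through the 7th year and gives way to an excess of retarded children from the 8th year on.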




TABLE III. 
AVERAGES FOR CERTAIN YEARS. (TERMAN AND CHILDS.) 

Chronological Age    Mental Age 
       4.75              6.50 
       7.50              8.00 
      12.33             11.00 

But far more important is a positive result, viz.: 
the international accordance in the judgment as to 
special ease or special difficulty of certain test-levels. 
It is certainly not of minor significance that the 6- 
year old tests were too easy and the 11-year old as 
uniformly too difficult, with the 8 and 9-year old ap- 
parently forming a between-lying zone in the case 
of children in the common schools of America, Ger- 
many, France and England, all without exception. 
That, despite the differences in race and language, 
despite the divergences in school organization and 
in methods of instruction, there should be so decided 
agreement in the reactions of the children — is, in 
my opinion, the best vindication of the principle 
of the tests that one could imagine, because this 
agreement demonstrates that the tests do actually 
reach and discover the general developmental condi- 
tions of intelligence (so far as these are operative in 
public school children of the present cultural epoch), 
and not mere fragments of knowledge and attain- 
ments acquired by chance. 

And this confirmation of the principle may also 
lead us confidently to expect that the discrepancies 
that have been revealed at the same time in some of 
the details of the system can be obviated in the 
future. 

(c) Children of different social strata. Social 
differences turn out otherwise than do differences 
of nationality, for they come out more or less con- 
spicuously in the results of the tests. The task of 
making comparative investigations by the graded 
tests of children of different social levels was under- 
taken in 1910 by Binet (36, p. 187) and by Breslau 
teachers, simultaneously. 

The incentive that led Binet to undertake this 
problem arose in certain investigations conducted 
by Decroly and Mlle. Degand in a private school at 
Brussels (45), the results of which seemed to cast a 
measure of doubt upon the value of Binet's tests, 
since the tests turned out, all of them, to be too easy. 
To be explicit, of 45 children tested, no one was be- 
low, 9 were at, and the rest were above the level of 
their age (13 by one year, 17 by two years, and 9 
even by three years).* Binet now points out, and 
rightly, that these figures present no argument what- 
soever against the value of his tests, but merely af- 
ford a positive contribution to the study of the dif- 
ferentiation conditioned by social factors. For all 
these Belgian children were sprung from the circles 
of the cultured middle class, whereas the Parisian 
children to whom the tests were 'fitted' belonged 
to lower classes. Binet, on this basis, reckons the 
average difference in mental age between children 
of the higher and lower classes at approximately a 
year and a half. Of course, this figure can stand 
only as a rough approximation; it will vary, partic- 
ularly at different levels of chronological age, a 
point to which Binet, unfortunately, does not refer. 

*See the review by Bobertag, Zeits. f. angew. Psych., 5 : p. 20§, 

Binet, himself, also induced some school directors 
of his acquaintance to take up this question in 
Paris. As a matter of fact, children in the superior 
schools were not considered, and the attempt was 
made merely to ascertain whether an influence of 
social environment could be discerned within the 
common schools. It is to be regretted that these 
tests were carried out upon but an extraordinarily 
small number of children. 

One investigation (p. 194) that was restricted to a single school 
came to no result. In this study there were examined 54 children, 
classified into four groups on the basis of social status. It may be 
mere accident that relatively more advanced children were found 
among the poorest than among the other groups; but at any rate 
there was no trace of any positive relationship between mental 
age and social position. Probably, as Binet himself has already 
pointed out, the social differences present in this study were too 
small to affect the outcome. 

TABLE IV.
DISTRIBUTION OF TWO GROUPS OF 30 PUBLIC SCHOOL CHILDREN EACH.

                         Retarded        At Age       Advanced
                     2 Years  1 Year              1 Year  2 Years
Poor Neighborhood       1       11         13        4       1
Good Neighborhood       1        3         10       10       6

On the other hand, a clear difference was revealed
when comparison was made of two public schools
(p. 198), one of which was situated in the poorest
quarter, the other in a relatively well-to-do neighborhood
of Paris. There were tested from each
school 30 children of corresponding ages, selected
without reference to their school performance.
Table IV shows how much more numerous were the
cases of retarded intelligence in the poorer school.
Binet figures the average superiority in mental age



52 PSYCHOLOGICAL METHODS OF TESTING INTELLIGENCE 

of the children of the better situated school to be 
three quarters of a year." 
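
As a rough check on the figure just quoted, the counts of Table IV can be averaged directly. The sketch below is illustrative only: it assumes that every child deviates from his age by a whole number of years, which is how the table groups them, and the variable names are introduced here for the example. Under that assumption the difference between the two schools comes out at 0.8 year, which agrees fairly well with the three quarters of a year that Binet reports.

```python
# Counts from Table IV, keyed by the deviation of mental age from
# chronological age in whole years (-2 = two years retarded, and so on).
poor = {-2: 1, -1: 11, 0: 13, 1: 4, 2: 1}
good = {-2: 1, -1: 3, 0: 10, 1: 10, 2: 6}


def mean_deviation(counts):
    """Average deviation in years, weighting each step by its count."""
    total = sum(counts.values())
    return sum(dev * n for dev, n in counts.items()) / total


poor_mean = mean_deviation(poor)
good_mean = mean_deviation(good)
gap = good_mean - poor_mean

print(f"poor: {poor_mean:+.2f}  good: {good_mean:+.2f}  gap: {gap:.2f} years")
```

Binet may well have worked with finer gradations of mental age than whole years, so exact agreement with his figure is not to be expected.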

A question as interesting as it is difficult to answer 
arises when we seek the causes of these differences 
in performance. It would obviously be very prema- 
ture to assume as already positively demonstrated 
that the intelligence, considered as innate mental 
ability, was of lower grade in children of the lower 
and poorer classes. Of course, it is not impossible 
that this may have been operative as a causal factor. 
One might, perhaps, assume that the very rise into 
the higher and better-off classes would itself predi- 
cate a certain intellectual selection, and that thus 
the children of these classes would have come into the 
world equipped with a superior intellectual endow- 
ment. 

But, on the other hand, it must be remembered
that no series of tests, however skillfully selected it
may be, does reach the innate intellectual endowment,
stripped of all complications, but rather this
endowment in conjunction with all the influences to

"Since Stern assembled this material there has appeared an 
American study that does not confirm the general principle of
environmental influence (J. Weintrob and R. Weintrob. The Influence
of Environment on Mental Ability as Shown by Binet-Simon
Tests. Jour. of Educ. Psych., 3: 1912, 577-583). The subjects
were 210 children, 70 from the Horace Mann School of Teachers 
College, Columbia University, representing children from wealthy, 
or at least very well-to-do families, 70 from the Speyer School, 
representing families of the "comfortable middle class" (wage- 
earners and small-business men), and 70 from the Hebrew Shelter- 
ing Orphan Asylum of New York, who were children springing 
from a very unfavorable environment. While the relatively small 
number of cases and the difference of nationality may render the 
outcome less conclusive, the results from the three institutions 
"showed very small and inconsistent differences." The original 
article should be consulted for further analysis of the data.—
Translator.

which the examinee has been subjected up to the
moment of the testing. And it is just these external
influences that are different in the lower social
classes. Children of higher social status are much
more often in the company of adults, are stimulated
in manifold ways, are busy in play and amusements
with things that require thinking, acquire a totally
different vocabulary and a notable command of language,
and receive better school instruction; all this
must bring it about that they meet the demands of
the tests better than children of the uncultured
classes.

Presumably, each of these factors, internal and 
external, endowment and environmental influences, 
plays a role in the result; but we shall have to wait
until many more, and more extensive, investigations have
been made before we can secure more exact knowl- 
edge of the actual amount and range of influence 
possessed by the one or the other of them. The way 
to approach this problem is by special analysis of 
the data: it will be necessary to find out in which 
tests the superiority of the children of the cultured 
classes is particularly evident and which tests are
passed with equal facility by children of both classes.

The material for a preliminary comparison of this sort has been
drawn by Binet (36, p. 191) from the tables of Decroly and
Degand. In them it is very interesting to note that the special
superiority shown by better situated children is in those tests that
involve thinking in the true sense of the term (apprehension,
comparison, criticism, formation of concepts, synthesis, etc.), though,
it must be admitted, most of them put a premium on linguistic
readiness. The tests here included are: description and explanation
of pictures, comparison of two objects, definition of abstract
terms, recognition of omissions in drawings, criticism of absurd
statements, arrangement of the five weights, naming 60 words in
three minutes. To these are to be added certain tests that obviously
depend more upon external circumstances, like knowing
the days of the week, the months of the year and coins. On the
other hand, the tests that Binet designates as revealing social differences
only slightly are for the most part those that hinge on
school instruction, as copying, writing from dictation, counting
backwards, making change, drawing a diamond. Only a single one
of the tests that fail to reveal social differences is a real test of
intelligence: the completion of gaps in a text. However, in view
of the small number of children on which these results could be
based, any generalization of the conclusions from them is
to be avoided.

This problem of social differences and their effect
upon intelligence leads over directly to certain practical
pedagogical principles. We may think, in this
connection, for instance, of the demand [in Germany]
for the establishment of the 'common' school
(Einheitsschule), in which children of all classes of
society shall be included without distinction. It
seems to me that in the discussion of this problem,
just as in the problem of co-education, the purely
psychological presuppositions are kept too little
in mind because the socio-ethical phase of the question
tends to claim first attention.
But how the psychological methods of testing intelligence
can become of direct service for these
practical questions will be shown, I hope, by an investigation
with the Binet-Simon tests that is now
being undertaken by a group of teachers in Breslau.
The problem under study is that of a systematic
comparison of pupils in a Volksschule and those in a
Vorschule, i. e., the younger pupils in the Vorschule
of a Gymnasium. The aim is to find out whether

See the footnote on p. 55. — Translator.

As it is impossible to render these terms in English equivalents,
it is proper to explain that the German Volksschule is the
elementary public school attended by children of the laboring or
lower business classes. In it attendance is absolutely compulsory

there exist typical differences of intelligence between
groups of children of the same age, and what
magnitude these differences attain at different ages.
In Prussia pupils may enter the Sexta (lowest class)
of the Gymnasium after three years in the Vorschule, 
but only after four years in the Volksschule. The 
tests were also aimed to discover to what extent this 
rule is psychologically justified, not only by the dif- 
ferences in the curricula of the two schools, but also 
by the general mental maturity of the children. 

Five groups were tested that had been carefully 
planned to be comparable in the matter of age, viz. : 
7 and 9 year old pupils of the Vorschule, and 7, 9 and 
10 year old pupils in the Volksschule — in all about 
150 boys. (See above, pp. 35 f., for some of the precautionary
rules observed in testing.) The results are
now being worked out; but, thanks to the courtesy
of the investigators, I have been able to get some of 



from 6 to 14, unless the child is otherwise instructed. The Gymnasium
is one of a number of so-called 'higher' or 'secondary'
schools with a 9-year curriculum (ages 9 to 18, or more), and
preparatory for entrance into the university. Children of the better
classes, destined for higher education, enter the Gymnasium
(or some variant of it) after a preliminary three-year training
(ages 6 to 9) in a Vorschule, which is thus virtually a special elementary
school for better-class children. Relatively infrequently
do children started in the Volksschule later enter the Gymnasium.

A demand is now being made by certain interests in Germany
for the abolishment of these distinctions, at least in part, by compelling
all children to begin school instruction in the same school
(Einheitsschule), a proposal which has been, and is, the occasion
of very active, and even bitter discussion.

I have given a somewhat fuller explanation of the German
school system in Appendix II of my translation of Offner's Mental
Fatigue, an earlier number of this series of Educational Psychology
Monographs. — Translator.

them now, and from them I have prepared the figures
that appear in Table V. These figures, which
must be regarded as strictly provisional, merely indicate
with what percentual frequency all the tests,
taken collectively, for which I have the data, have
been passed, and this, it should be noted, for the
three older groups of children only.

TABLE V.
PERCENTAGE OF TESTS PASSED IN CERTAIN AGE-LEVELS AT BRESLAU.

                               9-12    9 and 10    11 and 12
9-Year Vorschule Pupils         70        77          64
9-Year Volksschule Pupils       60        81          34
10-Year Volksschule Pupils      70        86          46

The first column shows that the 9-year old Volksschule
pupils rank in the number of tests passed 10
per cent. below the pupils of the same age in the better
school, while the 10-year old Volksschule pupils
attain the same measure of success as the Vorschule 
pupils a year younger. That, however, there is no 
real equality in this relationship is shown by the two 
other columns in which the percentage for the easier 
tests (9 and 10 levels) and the harder ones (levels 11 
and 12) are calculated separately. While, in the 
easier tasks the Vorschule pupils, curiously enough, 
rank a little below the Volksschule pupils of their 
own age and 9 per cent. below the older pupils in
that school, the outcome is quite different when we
pass to the harder tasks (third column). These
tests which lie above the age-level of the subjects
are passed by the Vorschule pupils nearly twice as
well as by their mates of like age in the Volksschule,
and even the older pupils in these tests fall 18 per
cent. behind the younger children of the better
school. If this interesting result should be confirmed
again in the detailed computations, as it probably
will be, we should then say: Children of different
social classes differ from each other less in the
performances appropriate to their age than in the
mastery of tasks that really lie above their level.
We would have, then, a numerical demonstration for
that well-known early ripening of children of the
higher classes, for the anticipation of phenomena of
developmental stages yet to come before the content
of the current stage of development is fully exhausted.

"For many very important tests, as for instance, description of
the pictures, no results are at my disposal yet.
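
The comparisons drawn from Table V in the preceding paragraphs are simple differences between its entries, and can be retraced as follows. This is only an illustrative sketch: the labels "all", "easy" and "hard" are shorthand introduced here for the three columns (levels 9-12, 9 and 10, 11 and 12), and the figures themselves remain provisional, as the text stresses.

```python
# Percentage of tests passed (Table V), by group and by test level.
table_v = {
    "Vorschule, age 9": {"all": 70, "easy": 77, "hard": 64},
    "Volksschule, age 9": {"all": 60, "easy": 81, "hard": 34},
    "Volksschule, age 10": {"all": 70, "easy": 86, "hard": 46},
}


def vorschule_lead(level):
    """Points by which the Vorschule group leads each Volksschule group."""
    reference = table_v["Vorschule, age 9"][level]
    return {
        group: reference - row[level]
        for group, row in table_v.items()
        if group.startswith("Volksschule")
    }


for level in ("all", "easy", "hard"):
    print(level, vorschule_lead(level))

# Ratio on the harder tests for pupils of the same age ("nearly twice").
hard_ratio = table_v["Vorschule, age 9"]["hard"] / table_v["Volksschule, age 9"]["hard"]
print(f"hard-test ratio at age 9: {hard_ratio:.2f}")
```

A negative lead on the easier tests and a large positive lead on the harder ones is the pattern the text describes.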

We may look forward with interest to the final re- 
sult of these investigations. 

The material of Table V also furnishes further 
confirmation of a law of differential psychology : the 
more complex a mental function, the more difficult 
to bring it into action, the later its appearance in 
the course of development, then so much the greater 
is its variability, and so much the more definitely are 
men and groups of men differentiated by it (cf. 1, 
pp. 258 and 269). 

(d) Intelligence and school performance. The 
relation of these two factors is easily the most im- 
portant problem presented by our theme for prac- 
tical pedagogy. For at this point we may hope to 
get an insight into the factors that condition the 
progress of children within the school, the place that 
they take among their fellow-pupils on the basis of 
their work, and the way their marks turn out. People
are generally inclined to think there is a very
close connection between intellectual ability and
school ability: good pupils are forthwith regarded 
as intelligent, and good school work is, with a cer- 
tain obviousness, expected of intelligent children, 
poor school work of the poor groups. So long, of 
course, as we had no special means of testing intelli- 
gence, there was no foundation on which to build up 
more exact knowledge of these interrelations: we 
had to content ourselves with opinions and with the 
generalization of occasional observations. 

But now we are beginning to get on firmer ground. 
Tests of intelligence have already taught us that the 
relations between intelligence and school ability are 
by no means so strict and uniform as most persons 
had thought. Just here we are concerned with the 
conclusions reached by the Binet-Simon method with 
normal children, but we shall encounter the same re- 
sult later on in two places (II, 4c and III, 3). 

We have two measures for the school capacity of a 
child that we want to compare with his intelligence — 
his pedagogical age and his marks. 

Pedagogical age is the normal age of the class to 
which the child belongs. If we assume that school- 
ing begins at 6, then the pedagogical age of a class 
that is just entering upon its fourth school year is 
6 + 3 = 9 years. If there is in this class a child 11 
years old, he then has a pedagogical retardation of 
two years, while an 8-year old classmate has an ad- 
vancement, or acceleration, of one year. The latter 
is very rare with us, on account of the exact way in 
which promotions are regulated; cases of it appear
mostly when a child enters after private preparation
or from another school. Outside of Germany
cases of pedagogical advance seem to be more common.
Retardations are, however, quite frequent in
consequence of non-promotion, long illness, etc., and
sometimes they reach a considerable degree.
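
The definition of pedagogical age given above amounts to a small piece of arithmetic, which may be sketched as follows. The function names are introduced here for illustration, and the entry age of 6 is the assumption made in the text's own example.

```python
SCHOOL_ENTRY_AGE = 6  # assumption taken from the example in the text


def pedagogical_age(school_year, entry_age=SCHOOL_ENTRY_AGE):
    """Normal age of a class just entering its `school_year`-th year."""
    return entry_age + (school_year - 1)


def pedagogical_deviation(child_age, school_year):
    """Years of advancement (positive) or retardation (negative)."""
    return pedagogical_age(school_year) - child_age


# The example of the text: a class entering its fourth school year.
print(pedagogical_age(4))            # normal age of the class
print(pedagogical_deviation(11, 4))  # the 11-year old classmate
print(pedagogical_deviation(8, 4))   # the 8-year old classmate
```

With these definitions the 11-year old shows a deviation of -2 (two years of retardation) and the 8-year old a deviation of +1 (one year of advancement), as stated in the text.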

Comparisons of pedagogical and mental age have 
been made by Binet and by Goddard. 

TABLE VI.
RELATION OF PEDAGOGICAL AND MENTAL AGE (BINET).

                               Mental Age
Pedagogical Age     Retarded   At Level   Advanced   Total
Retarded               14          9          1        24
Normal                 16         33         16        65
Advanced                           5          7        12
Total                  30         47         24       101

Binet (36, p. 162) presents a distribution-table for 
101 pupils and regards the agreement as tolerably 
satisfactory (Table VI). In fact, we do note that 
there are no paradoxical cases : no one of the children 
with mental retardation is pedagogically advanced, 
and only a single mentally advanced child turns out 
to be pedagogically retarded (and that case may be 
conditioned by illness). Yet in the remainder of the
Table there are divergences of considerable magni- 
tude: only a scant third of the mentally advanced 
are also pedagogically advanced; less than half of 
the mentally retarded are likewise pedagogically re- 
tarded, while, of the pupils 'at age' pedagogically, 
one quarter surpass and another quarter fall short 
of the mental level of their age. 
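
The proportions named in this paragraph follow directly from the counts of Table VI. The short sketch below retraces them; the blank cell of the table is taken as zero, in accordance with the statement that no mentally retarded child is pedagogically advanced, and the names used are for illustration only.

```python
# Table VI: rows are pedagogical status; columns are mental status in the
# order (retarded, at level, advanced). The blank cell is taken as 0.
table_vi = {
    "retarded": (14, 9, 1),
    "normal": (16, 33, 16),
    "advanced": (0, 5, 7),
}
RET, LEVEL, ADV = 0, 1, 2


def column_total(col):
    """Number of children with the given mental status."""
    return sum(row[col] for row in table_vi.values())


# Share of the mentally advanced who are also pedagogically advanced.
adv_both = table_vi["advanced"][ADV] / column_total(ADV)
# Share of the mentally retarded who are also pedagogically retarded.
ret_both = table_vi["retarded"][RET] / column_total(RET)
# Among pupils pedagogically 'at age': shares above and below mental level.
normal_total = sum(table_vi["normal"])
normal_above = table_vi["normal"][ADV] / normal_total
normal_below = table_vi["normal"][RET] / normal_total

print(f"{adv_both:.2f} {ret_both:.2f} {normal_above:.2f} {normal_below:.2f}")
```

The four values are 7/24, 14/30, 16/65 and 16/65, that is, a scant third, somewhat less than half, and about one quarter each.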

An exact computation of these relations can be
made by using the method of contingency. Contingency
means the degree of correspondence between
two intersecting groups. If all children
pedagogically retarded should also exhibit mental
retardation, or if the converse should occur, then the
contingency would be absolute (= 1); if among the
pedagogically retarded children there were relatively
no more mentally retarded than among the
children with normal or with superior school attainments,
then the contingency would be = 0. The degree
of correspondence can be shown by a number
lying between 0 and 1, termed the coefficient of contingency.

"The formula for it is developed in another of my treatises
(1, 308 ff.).

From the above tables I have computed the following
values:

                                                        Degree of
First Factor               Second Factor              Correspondence
Pedagogical retardation    Mental retardation              0.41
Mental retardation         Pedagogical retardation         0.30
Pedagogical advance        Mental advance                  0.45
Mental advance             Pedagogical advance             0.19



The index of correspondence, then, is but moder- 
ately large at best and even that only when we pass 
from school ability to intelligence, not in the re- 
verse direction. Hence, to draw conclusions about 
school status from varying intellectual abilities is 
even less permissible than to draw conclusions about 
intellectual ability from varying school status. 
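
The formula for the coefficient is not reproduced in this text, so the following is offered only as a sketch under an assumption. If p is the share of the first group that also shows the second trait, and q is the share of that trait in the whole sample, then the index (p - q) / (1 - q) equals 0 when the first group is no different from the sample at large and 1 when every member of it shows the trait. Applied to Binet's counts in Table VI, this assumed form comes within 0.01 of the four values printed above; applied to Bobertag's table further on it agrees only roughly, so it should not be taken as the author's own formula.

```python
N = 101  # children in Table VI


def correspondence(both, first_total, second_total, n=N):
    """Assumed index (p - q) / (1 - q); not confirmed as the author's formula.

    both         -- children showing both the first and the second trait
    first_total  -- children showing the first trait
    second_total -- children showing the second trait
    """
    p = both / first_total
    q = second_total / n
    return (p - q) / (1 - q)


# Counts read from Table VI.
values = {
    "pedagogical retardation -> mental retardation": correspondence(14, 24, 30),
    "mental retardation -> pedagogical retardation": correspondence(14, 30, 24),
    "pedagogical advance -> mental advance": correspondence(7, 12, 24),
    "mental advance -> pedagogical advance": correspondence(7, 24, 12),
}

for label, value in values.items():
    print(f"{label}: {value:.3f}")
```

Whatever the exact formula, the asymmetry is visible in the raw shares as well: 14 of the 24 pedagogically retarded are mentally retarded, but only 14 of the 30 mentally retarded are pedagogically retarded.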

In his mass-experiment, Goddard (48) came to a
similar result. He found that more than the half of 
all the children tested were in classes that did not 
correspond to their mental age — most of them, as a 
matter of fact, in a lower and only a few in a higher 
class. 

Bobertag compared mental age with the school
marks (40, II, p. 501, Table II). In his table of distribution
(Table VII), too, there are no paradoxical
cases. As for the rest, the coefficients of contingency
are, according to my calculation, higher than
with Binet, but still, however, of only moderate magnitude:

TABLE VII
RELATION OF MENTAL AGE AND SCHOOL MARKS (BOBERTAG)

                             Mental Age
School Marks      Retarded   At Level   Advanced   Total
Poor                 29         17                   46
Satisfactory         26         79         21       126
Good                            13         31        44
Total                55        109         52       216

First Factor          Second Factor          Correspondence
Poor marks            Mental retardation          0.52
Mental retardation    Poor marks                  0.40
Good marks            Mental advance              0.59
Mental advance        Good marks                  0.47

Here, again, it appears that inference from school 
performance to mental ability is safer than from 
mental ability to school performance, though here 
the correspondence between intelligence and the 
school performance is not so slight as with Binet, as 
above cited. 

What, now, is the significance of this lack of com- 
plete agreement between school efficiency and the 
outcome of the tests of intelligence? 

In the first place one might say that this was an- 
other proof of the defectiveness of the tests. That, 
since pedagogical age and school marks are the con- 
densed formulation or expression of the long-con- 
tinued and many-sided efficiency of the child and 
hence much more characteristic than the outcome of
a half-hour's testing, we would place confidence in
the latter only if it agreed with the former ; that if 
it did not, then the tests or at least the gradation 
derived from them would amount to nothing. 

Now we have already alluded in what has gone 
before to the weakness of the gradations of intelli- 
gence discovered by the Binet-Simon method, and 
it is entirely probable that this insufficiency has con- 
tributed in part to the lack of agreement with school 
performances. Since, for example, the tests for
7-year old children are too easy, many less gifted 
7-year old children will reach the level of their age 
as a result of the testing, although they do not rank 
as "satisfactory" in the school. With the older 
children the reverse will obtain. Nevertheless, I 
do not think that this is the only cause of the lack of 
agreement: the true cause lies in something more 
fundamental. 

In the second place, one might believe that a true 
picture of mental endowment was given only by the 
tests, and that the blame for the disagreement 
should be ascribed entirely to the school; that the 
teachers had estimated the pupils wrongly when 
they assigned them marks not in accord with their 
mental level, and had treated them wrongly when 
they kept them back in a class beyond which they 
should have gone according to their mental level. 
In this vein, for instance, Goddard writes, for he 
refers this phenomenon almost entirely to a faulty
system of promotion (48, pp. 241 and 249). 

"This point has been made by Bobertag and others as well.

But to dispose of the matter in that way is to
"pour out the baby with the bath." Of course, the
teachers, being human, make mistakes and not a few
of the measures they adopt may be based upon mis- 
taken judgment of the mental maturity of the pupil, 
but it is inconceivable that half of all children should 
be victims of such mistakes. 

It seems to me, rather, that the results we have 
just been discussing themselves show that both of 
the opinions just cited are wrong. Complete agreement
between school ability and intellectual ability
is not to be expected at all nor even to be desired,
because performance in the school depends not only
upon intelligence, but also upon certain other and
quite different factors. Thus, strength of memory,
which, as is well known, is correlated only to a mod- 
erate degree with intelligence, certainly plays a 
large — perhaps a too large — role in the carrying on 
of school activities and in the estimate of their 
worth; the various special talents, too, cut across 
and modify the action of general intelligence. But 
beside this there are concerned factors that have 
nothing at all to do with intellect, but belong to 
the domain of will, in the widest sense of that term : 
I mean the degree and duration of attention, in- 
dustry and conscientiousness, sense of duty and 
capacity to fit into the social group. 

These are the essential elements that must be 
added to intelligence in order to transform mere 
potential to actual accomplishment, and these same 
elements are enough, even when conjoined with in- 
tellectual ability of lesser degrees, to produce ef- 
ficiency of a worthy degree. This is true in life, 
and it is true also even in the school ; and it is good 
that for once these relations should be brought out
clearly by numerical evidence. For the figures in
the tables above do show just this, that intelligence
is never more than a partial factor in school activity:
and this demonstration may serve to refute that
one-sided intellectualism that notes and values in
pupils only their intellectual ability. Not that intellectual
endowment is not still to be regarded as
a factor of chief importance : in truth when by tests 
of intelligence and other psychological devices we 
shall have obtained a more exact knowledge of it, 
there will be much of profit for the schools and many 
mistakes and wrong courses of procedure can be 
prevented, and this so much the more as we get 
clear ideas of the range and limits of its meaning 
and importance. If, for instance, a given pupil 
shows only a moderate success in the tests of intelli- 
gence but does distinctly good work at school, and 
if there is no chance that a special talent might have 
exerted a decided influence (which could easily be 
recognized if existent), then there is a probability 
approximating to certainty that this pupil's strength 
is to be sought primarily in qualities of character 
and will. 

Accordingly, the lack of agreement between tests 
of intelligence and school performance is really cal- 
culated to increase our confidence in the psycholog- 
ical test-methods. In this connection Kramer very 
pertinently remarks." "Had we found a strict 
parallelism between the results of the testing of intelligence
and the school performance, we should
have had to have felt the greatest distrust of the
method. It would have raised the suspicion that we
were doing nothing more than testing the school attainments
themselves, either directly or indirectly,
in which event the method would be futile for testing
native endowment and its application would be
superfluous, for we would need only to resort to the
school performance directly for the information."

"See reference 54, pp. 30-31. Kramer was alluding to the examination
of abnormal children, but what he says applies to normal
cases as well.

(e) Sex differences. Comparisons of the mental 
abilities of boys and girls have already been carried 
out in large numbers in experimental psychology, 
but they have been almost entirely confined to single 
tests," whereas the Binet-Simon serial tests have 
been used to but a surprisingly slight extent in 
the comparison of the sexes and have not yet led to 
positive conclusions. I confine myself to a brief ex- 
position of the material in question. 

Goddard tested 835 boys and 712 girls. Unfortunately,
he has thrown together the data for the different
ages: it follows that his figures (48, p. 250)
lose much of their value for comparative purposes, 
because retardation and advance have quite differ- 
ent meanings at different age-levels. Nevertheless, 
we may reproduce here the table of distribution for 
the children (which I have converted into percents). 

TABLE VIII
SEX DIFFERENCES AS SHOWN BY BINET TESTS (GODDARD)

               Retarded                       Advanced
          Two Years    One      At Age     One    Two Years
           or More     Year                Year    or More
Boys         18.5       23       34.5       20        4
Girls        18.5       17       36.5       23        5

The tabular results suggest a slight inferiority of
the boys, most evident in the group of those retarded
one year to which 23 per cent. of the boys, but only
17 per cent. of the girls belong; the girls show a
correspondingly greater percentage of their number
at or above age.

'The literature has been brought together by me elsewhere
(1: Bibliography, Section VI); here we may cite the general summaries
of the results of tests by Meyer and Wreschner, and the extensive
original studies of Cohn and Dieffenbacher (Nos. 1048,
1072 and 104 in the bibliography just cited). As one pretty generally
confirmed result may be mentioned, among others, that with
the Ebbinghaus completion method girls are clearly inferior to
boys of the same age.

Goddard's statement that retardation of marked 
amount is more frequent with boys is not borne out 
by his own tables, for the percentage of boys and 
girls is here the same, 18.5.
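
Both observations on Goddard's figures can be retraced by tallying the percentages of Table VIII. The sketch below is merely illustrative, and the category labels are shorthand introduced here.

```python
# Percentages from Table VIII (Goddard), as converted by the author.
table_viii = {
    "boys": {"retarded 2+": 18.5, "retarded 1": 23.0, "at age": 34.5,
             "advanced 1": 20.0, "advanced 2+": 4.0},
    "girls": {"retarded 2+": 18.5, "retarded 1": 17.0, "at age": 36.5,
              "advanced 1": 23.0, "advanced 2+": 5.0},
}


def at_or_above_age(row):
    """Percentage of a group standing at or above its age level."""
    return row["at age"] + row["advanced 1"] + row["advanced 2+"]


for sex, row in table_viii.items():
    print(sex,
          "total:", sum(row.values()),
          "at or above age:", at_or_above_age(row),
          "marked retardation:", row["retarded 2+"])
```

Each row sums to 100; the girls stand at or above age in 64.5 per cent. of the cases against 58.5 for the boys, while marked retardation is equally frequent in the two sexes.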

"It must, of course, be borne in mind that these are pupils of
schools for normal children, but the statement appears to be equally
untrue for abnormal children. From one of Chotzen's tables (44,
p. 462) I have calculated that excessive retardation, 5 years or
more, appeared in 7 of 158 feeble-minded boys (4.5 per cent), but
in 11 of 122 girls (9 per cent).

All the other investigators that have treated the
question of sex differences have obtained results
more favorable for the boys.

Particularly decisive are the results obtained by
Bloch and Preiss (38) upon Volksschule children in
the manufacturing city of Kattowitz, in Upper
Silesia. They tested 79 boys and 71 girls aged 7 to
11 years, all of whom displayed average native ability
and average school ability. The percentages
passing successfully the various tests show almost
in every one of them a very decided inferiority of
the girls. In Table IX I have brought together all
the tests for which Bloch and Preiss report the results
separately for the two sexes. No particular
sex difference appears in the description of pictures
and in the definition of abstract terms, and there is 
a slight superiority of the girls in the "hard" prob- 
lem-questions; but in all the other tests the boys 
afford much higher percentages of success, often 
more than twice as high. Take, for instance, the 8- 
year old children: more than half of the boys, but 
not a single girl can arrange the five weights cor- 
rectly; four-fifths of all 8-year old boys recall cor- 
rectly what they have read, solve the easy problem- 
questions and state correctly the difference in things 
recalled in memory, whereas the percentage of girls 

passing these tests successfully is only 28, 55 and
55, respectively. Where the same test runs through
several years, the sex difference is nearly always
greater in the younger than in the older children.
This corresponds, again, with the psychological law
that mental differences stand out more clearly in
difficult than in easy tasks.

TABLE IX
SEX DIFFERENCES AS SHOWN BY BINET TESTS (BLOCH AND PREISS)

                                              Percentage of Children
                                                 Passing the Test
Test                                    Age      Boys       Girls
Description of Pictures                          No difference
Memory of Story Read                     7       No difference
                                         8        80         28
                                         9       No difference
Arranging Three Weights                  7        73         33
Arranging Five Weights                   8        56          0
                                         9        66         29
                                        10        70         44
                                        11        77         42
Easy Problem-questions                   8        81         55
                                         9        90         76
                                        10       100        100
Hard Problem-questions                   9        25         41
                                        10        70         80
                                        11        70         80
Defining Abstract Terms                          No difference
Making a Sentence with Three Words       9        70         38
                                        10        82         40
                                        11       100        100
Arranging Words into Sentence           11        70         33
Naming 60 Words in 3 Minutes                      76         50
Detecting Absurdities                   11        77         40
Comparing Objects from Memory            7        60         50
                                         8        80         55

Bloch and Preiss themselves point out that the 
number of persons upon which these results are 
based is too small to warrant final conclusions, but 
it is surely worthy of note that the inferiority of the 
girls extends to so many different kinds of tests. 

Bobertag (40, II, pp. 503-4) compared the same 
number of boys and girls of each age that ranked 
average in their school work. In each age the mental 
age of the boys turned out to be slightly above that 
of the girls ; the difference amounted to 1/7 year in 
the 8, 9 and 12-year old pupils, and to 1/5 year in the 
10 and 11-year old pupils. 

Mile. Descoeudres (46) compared a very small 
number of pupils — one intelligent and one unintelli- 
gent boy and a like pair of girls from each of six 
chronological ages. Taking all the right answers 
together, the boys had 52, the girls 48 per cent. 
There is here, then, also, a superiority of the boys, 
though the amount of the difference is not, of course, 
significant. 

(f) Repeated tests of the same children. Attention
must be called to one other important experiment
included in the article of Bobertag's already
mentioned (40, II), an experiment that differs fundamentally
from all that have been conducted heretofore.
Bobertag retested in the year after a large
number of the children (83 in all) that he had tested
in 1909. The reapplication of the same tests does 
not seem to have caused any noticeable difficulty, be- 
cause the memory of the details of the testing of the 
year before had as good as entirely disappeared. 
This experiment throws light upon three problems. 
In the first place it sheds an unexpectedly favorable 
light upon the reliability of the test method. Bober- 
tag arranged the 83 children in order on the basis of 
the number of tests solved by each of them and 
found that the order in the two years coincided very 
closely, in fact the correlation amounted to 0.95.
Accordingly, even if the absolute grading into the 
different age-levels of intelligence that the Binet- 
Simon method affords is still somewhat uncertain, 
yet it is demonstrably very certain in its relative 
gradings. The position that a child takes in a group 
of children on the basis of a single testing of his in- 
telligence may be deemed to possess a high degree 
of reliability. 
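
How a coefficient of agreement between two rank orders may be computed can be sketched in Python as follows. The scores below are invented, and the choice of the rank-correlation (Spearman) coefficient is an assumption made for the illustration; the text does not say which formula Bobertag used.

```python
def ranks(scores):
    """Return 1-based ranks (highest score = rank 1), averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def rank_correlation(first, second):
    """Correlation of the two rank orders (Spearman's coefficient)."""
    ra, rb = ranks(first), ranks(second)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ra, rb))
    var_a = sum((a - mean_a) ** 2 for a in ra)
    var_b = sum((b - mean_b) ** 2 for b in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical numbers of tests solved by six children in two successive years.
tests_first_year = [31, 27, 24, 22, 19, 15]
tests_second_year = [36, 30, 31, 26, 23, 20]
print(round(rank_correlation(tests_first_year, tests_second_year), 2))
```

In this invented sample only two neighbouring children change places, so the coefficient stays close to 1; a value such as the 0.95 reported for 83 children implies that such exchanges of position were few and small.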

In the second place there comes to light a clear 
relation between the mental status of a child and the 
rate of his subsequent intellectual development. 
Those children that ranked 'at age' in the first test- 
ing had advanced next year exactly one year, on the 
average, while the retarded children had advanced 
only two-thirds of a year, and the advanced children 
one year and a quarter in the same period. 

In the third place Bobertag found that the num- 
ber of children that deviated, either above or below 
the level of their age, increased as their age in- 
creased. It follows from this that, as chronological 
age increases, the gradation of ages becomes progressively
of less significance as a standard of variation:
an intelligence that in the earlier years deviates
above or below the level of its age by even less than
a single year will in later years exceed this unit of
deviation, which has then become relatively smaller.
The same result had already been arrived at in in- 
vestigating abnormal children, as will be shown in 
the following section. 
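
The connection between the second and the third finding can be made concrete with a small calculation. The Python sketch below assumes, purely for illustration, that the yearly gains reported above (two-thirds of a year, one year, one year and a quarter) stay constant over several years; that constancy, the starting deviations and the names used are assumptions, not results from the source.

```python
# Yearly gain in mental age reported for each group (years of mental age per year).
YEARLY_GAIN = {"retarded": 2 / 3, "at age": 1.0, "advanced": 1.25}


def deviation_after(years, group, start_deviation=0.0):
    """Deviation of mental age from chronological age after `years` years,
    assuming (hypothetically) that the group's yearly gain stays constant."""
    return start_deviation + years * (YEARLY_GAIN[group] - 1.0)


# A retarded child starting half a year behind, an advanced child half a year ahead.
for group, start in [("retarded", -0.5), ("at age", 0.0), ("advanced", 0.5)]:
    path = [round(deviation_after(y, group, start), 2) for y in range(0, 5)]
    print(group, path)
```

Under these assumptions a deviation that begins at less than one year grows past a full year within a few years, which is the sense in which the single year becomes a relatively smaller unit at the later ages.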

4. Abnormal Children

(a) Mental arrest and mental retardation. The
mental quotient. When Binet devised his system of 
tests, he had particularly in mind the testing of ab- 
normal children in order that children of this type 
could be recognized opportunely and transferred to 
the special classes and to the institutions for the 
feeble-minded. Furthermore, Binet, together with
Simon, tried out his method upon a large number of 
such children, though, unfortunately, he has given 
us no detailed account of this investigation, but he 
did draw conclusions from his experiments that 
express the relation of feeble-mindedness to his 
method in very simple formulas. One of these 
theses refers to mental retardation and runs thus 
(38, p. 113) : "I am for my part of the opinion that 
every mental retardation amounting to two years 
can be regarded as a serious deficiency." A second 
of these theses refers to mental arrest and declares 
that the imbecile does not progress beyond the 
mental age of seven, the moron (feeble-minded in the 
narrower sense) beyond the mental age of nine.* 

*By other investigators and elsewhere by Binet the upper limit
of moronity is placed at 12 years. — Translator. 

The second investigation of feeble-minded children,
that of Goddard (47), likewise suffers from
lack of sufficiently detailed data. Goddard tested
the children and adults in the Institution for the
Feeble-Minded at Vineland, N. J., nearly 400 per- 
sons in all, using the 1908 Binet series. He reports, 
however, only the frequency with which the several 
age-levels were reached and does not relate these 
data to the chronological ages of his subjects, so that 
it is quite impossible to determine the degree of retardation
from his tables. We can only derive certain
conclusions that will be mentioned later on.

The only thorough investigations that have thus 
far been made upon large numbers of abnormal children
are, accordingly, the tests made at Breslau by
the psychiatrist Kramer (54) and Chotzen (in conjunction
with Nicolauer (43, 44)). These investigators,
by testing children of different types, have supplemented
each other's work in a fortunate manner.
Kramer's material consisted partly of young per- 
sons who had been brought before the juvenile court 
and referred thence to the psychiatrist for expert 
opinion, partly of children that had visited the 
clinics on account of mental or nervous affections. 
Chotzen applied the tests in his capacity of city 
school medical inspector for special classes: he, 
therefore, tested all the children that were newly 
turned over to the special school for defectives. 
While Kramer had to do mostly with older children, 
Chotzen's business led him to deal mostly with children
aged eight and nine years, but he extended his
investigation by including some of the older children 
in the special school. The technique was patterned 
exactly after that followed by Bobertag, who had
himself tested a group of abnormal children as well 
as normal children. 

Both of these investigators express a favorable 
opinion of the value of the method for their pur- 
poses. Thus Kramer writes : 

"In summing up our results we might say, first of all, that we 
are very much satisfied with the method for our purposes. Leav- 
ing the quantitative results entirely out of consideration, we came 
in the course of the testing, on account of the varied nature of the 
tests, to get acquainted with the peculiarities of the child's make- 
up, to understand surprisingly well his response to requirements of 
a varied sort, and acquired valuable insight into the qualitative 
differences in the method of reaction displayed by the feeble- 
minded. In the case of the children sent by the central office for 
corrective treatment, most of whom we could get hold of but for a 
single examination, the relatively short time that was needed 
(about 45 minutes to one hour) to reach a reliable judgment con- 
cerning their intelligence proved to be an exceptionally agreeable 
feature of the work. In all the cases in which a judgment con- 
cerning the intelligence could be reached by anamnestic data or on 
the basis of clinical observations themselves, there resulted with 
but few exceptions no contradictions with the outcome of the 
Binet testing" (54, p. 27). 

To turn, now, to the figures : To begin with the 
second of the two theses of Binet that we cited above, 
his assertion of the existence of a "mental arrest" 
has also found confirmation in other directions. 
This thesis may be stated thus: For every feeble-minded
child there is a level which, once attained,
represents a definite terminus for his capacities to 
meet the demands of mental tests. That is, even 
though his age advances, his capacities do not ad- 
vance further than this level. 

Goddard found that the inmates of his institution 
were distributed in terms of mental age rather uni- 
formly over the age-levels from one to nine years 
(with approximately 10 to 11 per cent. in each year),
whereas the levels 10 to 12, taken together, comprised
only 7 per cent. of the total number. Though
here, again, he unfortunately put together those 
children whose age was such that they might per- 
haps have been able later to transcend the level in 
which they were then found and the other inmates 
whose development had for long been completely 
checked, yet his results do at least demonstrate that 
the feeble-minded only rarely transcend the mental 
age of nine. 

By comparing these mental ages with the diag- 
noses of the physicians he arrived at the following 
schema : 

                     ------ Imbeciles -----
                     Low-    Middle-  High-
Type         Idiots  grade   grade    grade   Morons
Mental ages  1 to 2  3 to 4  5        6 to 7  8 to 12

Goddard's 'morons' coincide with our 'mentally 
feeble' (Debilen). The figures just given show that 
by far the greater number of them have a mental age 
of 8 and 9 years. 

Kramer (54, p. 29) and Chotzen (44, p. 494)
reached similar results. 

Goddard compared the experimentally determined 
mental age with the general impression that the 
children had made on the teachers and officers of 
the institution and found a very satisfying amount 
of agreement. The children of a given mental age 
formed a fairly homogeneous group, both in respect 
to their every-day accomplishments and their ability 
to adapt themselves to the demands of institutional 
life. He adds to this a description of what can be 
expected in the line of practical behavior of a child 
of a given mental age. But all of these statements
stand very much in need of further testing. 

It is well to give here explicit warning against a 
certain false conception of the term "arrest." An 
imbecile who, during his life, never progresses past 
the mental age of seven, is not on that account to be 
thought of as the same as a seven-year old child. 
He does grow beyond that status in many respects : 
he acquires experiences that a normal 7-year old 
child does not possess, picks up many accomplish- 
ments, experiences the awakening within himself of 
impulses and needs that come with increasing years. 
The arrest, then, pertains only to that specific group 
of mental abilities that are tested by the tests. And 
even some of these abilities may show some development
(cf. in this connection p. 86), only there
still remain so many defects of a fundamental na- 
ture that, all in all, it is impossible for him to rise 
above the mental age of seven. 

Of importance is, furthermore, the discovery that 
Goddard made concerning the mental age of a special
group, the morally feeble-minded. It turned out
that this group was recruited solely from the upper 
of the age-levels represented in the institution. Of 
22 such individuals, 15 had the mental age 9, 5 the 
age 10 and one each the ages 11 and 12. The circum- 
stance that moral defects do not extend down be- 
yond the mental age nine is explained by Goddard 
in the following way: Certain immoral instincts, 
like the impulse to lie, to steal, etc., normally awaken 
about the ninth year; later on reasoning develops 
and puts these tendencies under inhibition. With 
children whose mental age is below nine those instincts
are not yet developed, whereas with children
who are arrested at about the mental age of nine, 
the instincts do show themselves without getting far 
enough along to develop the inhibition and so be- 
come a moral defect. 

We may leave undecided the question of the cor- 
rectness of this explanation, but in any event the 
fact remains that pronounced retardation in moral- 
ity is not associated with equally pronounced intel- 
lectual deficiency. The moral deficiency therefore 
displays a certain independence in its existence, and 
to that extent the old designation "moral insanity" 
was not utterly devoid of significance. 

We may allude, also, at this point, to a very sim- 
ilar conclusion reached by Kramer, who must, natu- 
rally, have encountered this type frequently among 
his criminal subjects. He says: "We have to do 
here with individuals whose defectiveness is on the 
moral side and in whom there can be noted even 
from their early youth a decided lack of moral ideas 
and altruistic spirit. In raising the question as to 
in how far these moral defects exist independent of 
intellectual deficiency, it is worth noting that in the 
examination a number of these children obtained a 
result that corresponded with their actual age. And 
even in the cases in which the mental ability fell below
the norm, there was no parallelism at all between
the two kinds of deficiency."*

*See Reference 54, p. 28. We may mention also in this connection
the results obtained by Frau Dosai-Révész (4) with separate
tests. She compared the efficiency in computation, memory and
report of normal children, simple feeble-minded and morally feeble- 
minded and found that the results for the last-named group fell 
almost entirely between the results for the two other groups. 

But this discussion has already led us from the
consideration of mental arrest to the question of 
the mental retardation of the feeble-minded. Binet 
used as the measure of retardation simply the dif- 
ference between mental age and chronological age 
and was so convinced of the general application of 
this measure that he looked upon the value "2 
years" as a general expression for a definite and in 
fact serious deficiency. 

Binet's successors also made use of this standard,
but their own results teach us that we cannot be
satisfied with it. For it has become evident that one
and the same absolute difference, e. g., a mental retardation
of three years, means very different things
at different years. Thus Kramer (54) remarks: "It
should not be concluded that a 12-year old child with 
a mental age of 9 is of the same degree of feeble-mindedness
as an 8-year-old* child with a mental age
of 5. In the case of the children turned over to 
us for examination by the Central Child Welfare 
Bureau (Jugendfürsorgezentrale) it came out clearly
that the differences revealed among the younger 
children were for the most part but small, but 
among the older children always greater, although 
the actual defects in these two groups, so far as 
we could judge them by other criteria, by no means 
revealed any corresponding difference, but seemed, 
on the average, to be about the same." Chotzen
(44, p. 493) also corroborates this view: "On
account of a checking of development, the mental age
of feeble-minded children lags progressively more
and more behind their chronological age: the
younger they are, the more, and the older they are,
the less does a year's retardation mean in actual defectiveness."

*Page 29. In the text there is a typographical error here, 7 instead
of 8-year.

How considerable the fluctuations are may be 
shown by some figures (Table X) that I have de- 
rived from one of Chotzen's tables (p. 485). In ad- 
dition to the tests, or rather independently of them, 
Chotzen examined all the pupils of the special school 
as the physician and the psychiatrist ordinarily 
would, and classified them, on the basis of this ex- 
amination, into the stock groups — moron, imbecile, 
idiot. Some he had to class outside of these groups 
by designating them as 'not feeble-minded' or as 
'doubtful feeble-minded.' Now, it might be sup- 
posed that the members of any group, e. g., the 
morons, would necessarily show at least approxi- 
mately the same degree of mental endowment, re- 
gardless of differences in their chronological ages. 
But Table X shows that their mental retardation, 
computed as the absolute difference, has very dif- 
ferent values with the older than with the younger 
children, and Table XI, in which the average value 
of this measure of retardation has been figured for 
each age-level, reveals a rapid increase in the mag- 
nitude of the value, so that the 12-year-old imbeciles 
are retarded by twice as many years as the 8-year- 
old imbeciles (4.7 as against 2.3 years). 

From this it seems to me to follow that the abso- 
lute difference can be used only when we are dealing 
with children of a given age. If, for example, it 
should sometime be arranged to carry out tests of 
intelligence upon all 6-year-old children when they
entered school, then the designations "retarded one 
year" or "advanced one year" would have an un- 
equivocal meaning. 



TABLE X
FREQUENCY OF MENTAL RETARDATION IN DIFFERENT FORMS OF FEEBLE-MINDEDNESS
AND DIFFERENT CHRONOLOGICAL AGES

[Table not recoverable in this copy. Rows: chronological ages 8 to 13
(8 to 12 for morons and imbeciles). Column groups: Not Feeble-minded,
Doubtfully Defective, Morons, Imbeciles, each subdivided by years of
retardation (1 to 5 years). Entries: numbers of children.]


But it is another matter when we have to consider 
children of quite different ages or when we want to 
express the degree of backwardness in a formula of 
general validity. The value based on absolute dif- 
ference, if given by itself, may mean very different 
things, so that at least the chronological age ought 
always to be stated to enable the reader to figure out 
the degree of importance to be attached to the dif- 
ference. To what prolixity of statement this 
method leads one may be illustrated by the following
sentence from Chotzen (pp. 493-4): "Children
of 8 to 9 years can suffer a deficiency of one year,
those of 10 to 12 years one of two years without
feeble-mindedness being present, but a backwardness
of two, or of three years, respectively, for these ages,
certainly cannot coexist with normal intelligence."

TABLE XI
AVERAGE RETARDATION, IN YEARS, OF THE CHILDREN IN TABLE X

Chronological  Not            Doubtful
Age            Feeble-minded  Defect    Morons  Imbeciles
 8             0.65           1.3       1.9     2.3
 9             1.4            1.7       2.1     3.1
10             2.0            2.0       2.6     3.8
11             3.0            3.5       3.2     4.0
12             2.0            3.0       3.3     4.7
13                            3.5

That the size of the absolute difference for the 
same degree of feeble-mindedness should increase 
as age increases is psychologically easily intelligible, 
for, since feeble-mindedness consists essentially in 
a condition of development that is' below the normal 
condition, the rate of development will also be a 
slower one, and thus every added year of age must 
magnify the difference in question, at least as long 
as there is present anything that could be called 
mental development at all. With this in mind it is 
but a step to the idea of measuring the backward- 
ness by the relative difference, i. e., by the ratio be- 
tween mental and chronological age, instead of by 
the absolute difference. Bobertag had already con- 
ceived a plan of this sort, while Kramer (54, p. 30) 
hints at something of the sort, though very guard- 
edly: "Whether perhaps there might be devised a 
specific method of calculation for relating the dif- 
ference in years to chronological age and which 
would then give us an absolute measure for degree 
of feeble-mindedness, seems to me a matter of 
doubt." 

The results of Chotzen that now lie before us permit
us to test the feasibility of a relative measure of
this sort. I should like to recommend the relating 
to chronological age not of the difference, but of the 
mental age itself. We would then obtain the mental 
quotient that has already been mentioned (p. 42). 
This quotient shows what fractional part of the intelligence
normal to his age a feeble-minded child
attains. Mental quotient = mental age ÷ chronological
age. An 8-year old child with a mental age
of six has, then, a mental quotient of 6/8 = 0.75. A 
12-year old child with a mental age of 9 has the same 
mental quotient. 
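
The definition, and the contrast with the absolute difference, can be written out as a short Python sketch. The function names are chosen here for illustration only; the two children are those of the example just given.

```python
def mental_quotient(mental_age, chronological_age):
    """Mental quotient = mental age divided by chronological age."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age


def absolute_retardation(mental_age, chronological_age):
    """The older measure: chronological age minus mental age, in years."""
    return chronological_age - mental_age


# The two children of the text: unequal absolute retardation, equal quotient.
for chronological, mental in [(8, 6), (12, 9)]:
    print(chronological, mental,
          absolute_retardation(mental, chronological),
          mental_quotient(mental, chronological))
```

The first child is two years behind and the second three, yet both obtain the quotient 0.75.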

TABLE XII
AVERAGE MENTAL QUOTIENT OF THE CHILDREN IN TABLES X AND XI

Chronological  Not            Doubtful
Age            Feeble-minded  Defect    Morons  Imbeciles
 8             0.92           0.84      0.76    0.71
 9             0.85           0.81      0.77    0.67
10             (0.80)         (0.80)    0.74    0.62
11             (0.73)         (0.68)    0.71    (0.64)
12             (0.75)         (0.75)    (0.73)  (0.61)
13                            (0.73)

Now when we turn into quotients the values cal- 
culated from Chotzen in Table XI, we obtain Table 
XII. The idiots have been omitted for reasons that 
will appear later. The figures in brackets are those 
that cannot be deemed reliable averages on account 
of too few individuals included in them. The table 
reveals mental quotients for the two main forms of 
feeble-mindedness that are, it is true, not constant, 
but that are, however, very similar through several 
chronological years. The morons, in especial, show 
surprisingly uniform values; their average quotient
varies only within the narrow range 0.71 to 0.77 for
the five years 8 to 12. Roughly expressed, therefore,
their intelligence, measured by that of normal per- 
sons, is a 'three-quarter intelligence.' The imbeciles 
show somewhat greater variations, but their mental 
quotients are in quite fair agreement, at least for 
the years 9 to 11. They entitle their possessors, 
again roughly speaking, to a scant 'two-thirds
intelligence.'
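
The step from Table XI to Table XII can be retraced. For children of one chronological age the average mental age is that age less the average retardation, so the average quotient is (age - retardation) ÷ age. The Python sketch below applies this to the moron and imbecile columns as printed; the variable names are illustrative, and small disagreements in the second decimal are to be expected, presumably because the printed averages are themselves rounded.

```python
# Average retardation in years, as printed in Table XI (ages 8 to 12).
AGES = [8, 9, 10, 11, 12]
RETARDATION = {
    "morons": [1.9, 2.1, 2.6, 3.2, 3.3],
    "imbeciles": [2.3, 3.1, 3.8, 4.0, 4.7],
}
# Average mental quotients as printed in Table XII, for comparison.
PRINTED_QUOTIENT = {
    "morons": [0.76, 0.77, 0.74, 0.71, 0.73],
    "imbeciles": [0.71, 0.67, 0.62, 0.64, 0.61],
}


def quotient_from_retardation(age, retardation):
    """Average mental quotient of children of one age, given their average retardation."""
    return (age - retardation) / age


for group, values in RETARDATION.items():
    for age, lag, printed in zip(AGES, values, PRINTED_QUOTIENT[group]):
        computed = quotient_from_retardation(age, lag)
        print(f"{group:9s} age {age:2d}: computed {computed:.3f}, printed {printed:.2f}")
```

The retardation of the imbeciles doubles between the ages of 8 and 12, while their quotient moves only from about 0.71 to about 0.61; for the morons it scarcely moves at all.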

The first two of Chotzen's groups are represented 
by too few cases to permit consideration of their 
averages, save at most for the younger ages. In 
these ages the mental quotient agrees finely with the 
medical diagnosis of the children. Those desig- 
nated as "not feeble-minded" have a mental quotient 
of about 0.90, while the doubtfully-defective, whose 
quotient lies between 0.80 and 0.84, form a real in- 
termediate grade between the 'not-abnormal' and 
the true morons. The isolated cases of older chil- 
dren (7 in all) that Chotzen classified in these two 
groups, are ranked by their quotient largely in the 
morons. It is possible that the mental quotient may 
supplement uncertain medical diagnoses in cases of 
this sort. 

Now the objection might be raised to the above 
series of quotients that they comprise only averages 
and that these have been derived in part from a too 
small number of values. To meet this objection I 
have made another computation in which I have 
worked out the mental quotients of individual chil- 
dren, and then have recorded their frequency-dis- 
tribution. In this computation I have disregarded 
chronological age, and have combined in each case 
the values for 10 points on the scale, e. g., the quotients
lying between 1.00 and 0.91, between 0.81 and
0.90, etc.
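
The grouping into classes of ten points can be sketched in Python as follows. The children listed are invented, and rounding each quotient to two decimals before classing it is an assumption suggested by the class limits just named (0.81 to 0.90, 0.91 to 1.00).

```python
from collections import Counter


def quotient_class(quotient):
    """Label of the ten-point class containing `quotient`, e.g. '0.71-0.80'."""
    hundredths = round(quotient * 100)
    upper = ((hundredths + 9) // 10) * 10  # round up to the next multiple of 10
    return f"{(upper - 9) / 100:.2f}-{upper / 100:.2f}"


def frequency_distribution(quotients):
    """Map each class to (absolute number, per cent of the group), highest class first."""
    counts = Counter(quotient_class(q) for q in quotients)
    total = len(quotients)
    return {label: (n, round(100 * n / total))
            for label, n in sorted(counts.items(), reverse=True)}


# Hypothetical (mental age, chronological age) pairs for one diagnostic group.
children = [(6, 8), (7, 9), (6, 9), (8, 10), (7, 10), (9, 12), (8, 11), (5, 8)]
quotients = [mental / chronological for mental, chronological in children]
for label, (number, per_cent) in frequency_distribution(quotients).items():
    print(label, number, per_cent)
```

Chronological age enters only through the quotient itself, so children of different ages fall into the same class whenever their quotients agree.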





TABLE XIII

[Table largely lost in this copy: frequency-distribution of the
individual mental quotients for the four groups, in absolute numbers
and in percentages. Surviving entries (number, per cent): 14, 48;
13, 45; and a single 37. Totals (number, per cent) for the four
groups: 33, 100; 29, 100; 55, 100; 111, 100.]




Table XIII shows the distribution of the results 
obtained in this way, both in absolute numbers and 
in percentages : Figure 1 also shows the distribution 
of the percentages graphically. 




FIG. 1. DISTRIBUTION OF MENTAL QUOTIENTS DERIVED FROM CHOTZEN'S RESULTS.
[Figure not reproduced. Four curves: not feeble-minded, doubtful, moron, imbecile.]

There appears a clear separation of the points of
maximal frequency for the chief groups, and, it is 
to be noted, the mental quotient of the 'not-abnormal' 
children lies mostly between 0.81 and 0.90, that of 
the morons between 0.71 and 0.80, that of the im- 
beciles between 0.61 and 0.70 — all quite in accord- 
ance with our earlier figures. In the case of the im- 
beciles the range of the quotients is wider than with 
the other groups, as the average values had already 
shown. Attention should be called to the fairly 
symmetrical form of the three curves : this brings it 
about that the point of maximal frequency and the 
average tend to coincide within each group. 

The transitional character of the group of doubt- 
fully defective also finds expression typically in that 
its members are distributed fairly uniformly over 
the regions that are characteristic on the one hand 
of the normals and on the other hand of the un- 
doubted morons. 

The number of the children tested by Chotzen is 
not yet large enough and particularly their distri- 
bution over the different age-levels is not wide 
enough to consider the above figures as having con- 
clusive value for other sets of material, yet they do 
seem to me so far removed from objection as to 
demonstrate that the mental quotient is a very much 
more useful measure of backwardness than the commonly
used absolute difference.

The quotient does not seem, however, to afford an 
actually constant expression of degree of feeble-mindedness,
but shows a tendency to fall in value as
age increases. This tendency, it is evident, is but
slight within the limits of age that have been mentioned,
so that for many problems it can be neglected.
Before and after these ages the fall in the value 
seems to take place more rapidly. In the case of the 
later age-levels this is easily intelligible, for once 
the stage of arrest that we have previously dis- 
cussed is reached (for morons at the mental age of 
9), the quotient obtained by dividing mental by 
chronological age must decrease as chronological 
age increases. The feeble-minded child, it must be 
remembered, not only has a slower rate of develop- 
ment than the normal child, but also reaches a stage 
of arrest at an age when the normal child's intelli- 
gence is still pushing forward in its development. 
At this time, then, the cleft between the two will be 
markedly widened. 
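
The fall of the quotient after the stage of arrest follows from the definition alone, as the following Python sketch shows. It rests on a deliberately crude model, assumed here only for illustration: mental age grows at a constant fraction of the normal rate until the level of arrest is reached and then stands still. The gradual slowing before arrest that was noted above is ignored.

```python
def mental_age(chronological_age, rate, arrest_level):
    """Mental age under the crude model: growth at a constant fraction
    `rate` of the normal pace until `arrest_level` is reached."""
    return min(rate * chronological_age, arrest_level)


def quotient(chronological_age, rate, arrest_level):
    """Mental quotient at a given chronological age under the same model."""
    return mental_age(chronological_age, rate, arrest_level) / chronological_age


# A moron in the sense of the text: 'three-quarter intelligence', arrest at mental age 9.
for age in range(8, 17, 2):
    print(age, round(quotient(age, rate=0.75, arrest_level=9), 2))
```

With these values the quotient holds at 0.75 up to the chronological age of 12, at which the mental age of 9 is reached, and thereafter equals 9 divided by the chronological age.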

From these considerations it follows that the 
mental quotient can hold good as an index of feeble- 
mindedness only during that period when the de- 
velopment of the feeble-minded individual is still in 
progress. It is for this reason that there is no sense 
in calculating the quotient for idiots, because, in 
their case, the stage of arrested development has 
been entered upon long before the ages at which 
they are being subjected to examination. The above- 
mentioned gradual tendency of the mental quotient 
to sink during the progress of development shows 
that this development approaches the final level of 
arrest at a progressively decreasing rate.*

Whether we shall succeed some time in finding a 
formula for a truly constant coefficient of feeble-mindedness
must be left for the future.



*In his last article (40, II) Bobertag lays special stress on this
progressive retardation in the rate of development of the feeble- 
minded and attempts to present it in graphic form. 




(b) Relation to the several tests. It must not be 
thought that the significance of the Binet-Simon 
method for the study of feeble-mindedness is re- 
stricted to the possibility of grading them quanti- 
tatively. Perhaps even more important than this 
is the qualitative analysis of the individual subject 
that the method allows and the discovery of how the 
several tests have participated in the final values. 
Chotzen's investigation, the first to attack this prob- 
lem, has shown how confusingly many special prob- 
lems and matters of interest are to be unearthed in 
this field. 

At the very outset, for example, there is thrust 
insistently upon us the question : Have we any right 
at all to equate a 10-year-old feeble-minded child 
with a 7-year-old normal child just because the result
of testing gives him a mental age of 7 years:
in other words, can we say that feeble-mindedness
is actually mere 'backwardness'? It is, indeed, quite
often asserted that this expression is misleading 
because feeble-mindedness is something qualita- 
tively different from normality. But the Binet- 
Simon method makes it possible for us to work out 
the comparison between the two mental conditions 
exactly. 

And in fact comparison does show that the mental 
age of 7 years is not reached in the tests in quite the 
same way that the normal 7-year-old child reaches 
the same mental age, for the area of irregular distribution
is very much wider with the feeble-minded
than with the normal child. Bobertag, in an as yet 
unpublished discussion, reckons the distribution at 
twice the area of that of a normal child. In other 
words we may say that the 'hits' and 'misses' of the
older feeble-minded children are scattered over very 
many more age-levels than are those of younger 
normal children : the defective fails unexpectedly to 
pass some quite easy tests, but succeeds here and 
there in meeting much higher requirements. There 
appears a certain dissociation of abilities that are 
normally more strictly intercorrelated. 

We are now in a position, moreover, to discover 
a general principle obtaining in this dissociation. 
There are certain abilities that are essentially a 
function of age, relatively independent of intelli- 
gence: there are other abilities that are conditioned 
entirely by specific degrees of intellectual develop- 
ment, regardless of the age at which this develop- 
ment is attained. A child of 9 or 10 years of age, 
even if he be defective, will be farther advanced than 
a normal child of 6 or 7 years of age in abilities of 
the first sort; but a normal child necessarily sur- 
passes a feeble-minded child in abilities of the second 
sort. 

A priori we should expect that to the first sort of 
abilities (those conditioned by age) would belong 
those dependent upon a mass of experiences fre- 
quently had and activities frequently discharged in 
everyday life. But a priori opinions of this sort are 
of no great service to us, and it will be of corre- 
spondingly great value for us to be able to discover 
by an analysis of the results of Binet-Simon tests 
which of the tests applied to the feeble-minded cor- 
relate more with age and which more with real in- 
telligence. Up to now the results of Chotzen are 
alone available for this purpose and even they afford
but an incomplete survey because Chotzen had
to deal almost entirely with feeble-minded children 
of a single age-group (8 to 9 years). 

Chotzen gives us a whole series of computations 
to show the worth of the different tests for the diag- 
nosis of feeble-mindedness : the perusal of his diffi- 
cult exposition will afford the reader a new idea of 
the complications that arise when one really tries to 
analyze the serial system of tests to the last details. 
Because a repetition of investigations of this sort, 
especially with feeble-minded children of more ad- 
vanced ages is very much to be desired, we feel war- 
ranted in introducing here a brief account of the 
methods that Chotzen pursued in evaluating the 
tests. 

The simplest thing is, of course, the direct com- 
parison of feeble-minded with normal children of 
the same age (using Bobertag's data). 

From such a comparison it appeared (44, p. 440) that the back- 
wardness of the feeble-minded was least in the following tests : 
telling forenoon from afternoon, defining in terms of use, knowing 
own age, esthetic judgment, telling the number of the fingers, 
describing a picture, counting 13 pennies ; the backwardness was, 
on the other hand, very pronounced in the following : memory-span 
for 16 syllables and for 5 digits, making change (80 Pfennige for 
1 Mark), counting backward from 20 to 0, definition by super- 
ordinate terms, comparison of two objects from memory, recall of 
a short story, naming the months and arranging the five weights. 

With children of other ages these lists would pre- 
sumably change. Thus the explanation of the pic- 
ture which is demanded of older children would 
doubtless bring out a decided difference between 
normal and feeble-minded children, though the de- 
scription of the picture which is demanded of the 
younger children did not bring out such a difference, 
according to Chotzen. 

However, even these lists of Chotzen's suffice to
show that the differences between the two types of 
children turn out to be small in those tests that re- 
late to frequently practiced activities (counting, tell- 
ing how old they are) and to common experiences of 
everyday life (number of fingers, forenoon and 
afternoon) ; on the other hand, the deficiency of the 
feeble-minded is at once revealed in its entirety 
the moment that something unusual is demanded, 
that something new is presented and that attention 
must be sharply concentrated. 

A similar comparison can be carried out, in the 
next place, amongst the special-class pupils them- 
selves, i. e., between the different groups of feeble- 
minded that the medical diagnosis had established: 
Chotzen found out which tests exhibited a specially 
decided drop from one group to another in the 
feeble-minded children of the same age. I mention 
only those that showed a clear falling off of one-half 
in passing from the "not feeble-minded" to the 
morons and from the morons to the imbeciles. 

For 8- and 9-year-old children: drawing a diamond, repeating
five digits, easy problem-questions. There was a somewhat smaller 
falling off in counting five coins and comparing two objects. 

For older children (Chotzen had also tested a series of older 
children for purposes of comparison) : comparison, reproduction of 
the item in the newspaper, arranging five weights, making change, 
defining by superordinate terms, knowing various pieces of money, 
repeating five digits. 

Thirdly and lastly, Chotzen figured out compara- 
tive results for those subjects of the same mental, 
but of different chronological age, as might happen, 
for instance, if an 8-year-old child were retarded 
two years, a 9-year-old child three years, or a 10-
year-old child four years. He found that when
children of a single mental level were considered, 
some tests show a clear increase in capacity with in- 
crease in chronological age, others no alteration, 
while yet others an actual decrease. Tests of the 
first sort, those that have an 'age-increase,' are 
doubtless tests that have least to do with intelli- 
gence, because, given the same intelligence, they are 
nevertheless better done by the older children. On 
the contrary, the other tests plainly stand in correla- 
tion with intelligence, more particularly the tests in 
which the older children actually turn out poorer. 
The following are the results:

Decided increase with age is shown in copying, writing from 
dictation, the recall of two items of a story, naming the days of 
the week. 

"The tests accompanied by strong increase with age relate, then, 
almost exclusively to matters of information, particularly of 
school-information, the assimilation of which depends on the ex- 
tent of instruction. Where only a slight increase is to be detected, 
information also plays a rôle in some of the tests (five coins,
knowing age), but for the most part the tests are such that not 
only practise, but also the natural increase of efficiency will im- 
prove the results, e. g., execution of three orders, counting back- 
wards, repeating 16 syllables. In all of these the increase with 
age is slight. No increase at all is present with tests that demand 
ability to judge and to combine or with such as put severe de- 
mands upon apprehension — comparison, problem-questions, noting 
omissions, repeating five digits" (44, p. 453). 

To this last category probably belong also: recall of six details 
of a story, arrangement of the five weights, explanation of a pic- 
ture, making change, though the figures are too small in these cases 
to permit positive conclusions. 
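
Chotzen's third comparison amounts to a simple classification, which may be sketched as follows. This is an illustration only: the pass percentages, the labels Test P, Q and R, and the 10-point threshold are hypothetical and are not taken from Chotzen's data.

```python
def age_trend(pass_rates, threshold=10.0):
    """Classify a test by how its pass rate changes with chronological age.

    pass_rates: percentages passing, ordered from the youngest to the oldest
    group, all groups standing at the same mental age. The threshold (in
    percentage points) is an arbitrary cut-off chosen for this illustration.
    """
    change = pass_rates[-1] - pass_rates[0]
    if change >= threshold:
        return "increase"    # helped by age and schooling
    if change <= -threshold:
        return "decrease"    # older children of the same mental age do worse
    return "no change"       # performance follows mental level, not age


# Hypothetical figures for 8-, 9- and 10-year-olds of one mental level.
sample = {
    "Test P": [30.0, 55.0, 80.0],
    "Test Q": [50.0, 48.0, 52.0],
    "Test R": [60.0, 50.0, 45.0],
}
trends = {name: age_trend(rates) for name, rates in sample.items()}
```

On the reasoning of the text, tests of the first kind would have least to do with intelligence, and tests of the second and third kinds most.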

When we compare with each other these different
lists obtained in different ways, we note, it is true, 
deviations in many details, yet, taken as a whole, the 
same tests keep cropping up as the ones in which 
defective intelligence is laid bare, unconcealed and
uncompensated, while in the other tests the defect of
intelligence can be made good by greater age. 

When investigations of this kind shall have been 
carried out with a large number of feeble-minded 
individuals of different chronological ages, we may 
hope to reach a far deeper insight into the whole 
structure of defective intelligence in its different 
stages of development and degrees of enfeeblement. 

(c) Intelligence and school ability. The problem 
we have already met with the normal children (pp. 
57 ff.) meets us again with the abnormal and leads 
us to quite similar conclusions. That is, only a par- 
tial correspondence exists between the magnitude 
of the mental defect and the reduction in school 
ability. Kramer states that a large number of chil- 
dren were retarded in their school classes by the 
same number of years as they were retarded in in- 
telligence, yet there were a good many who were 
more backward in school status than in intelligence 
(the opposite condition, less backward in the school, 
almost never obtained). In fact, there were some 
children completely incompetent for school work in 
whom no corresponding mental defect could be made 
out. 

Similarly, among the 8- and 9-year-old children 
turned over to the special classes (auxiliary school) 
Chotzen found a large number that did not have the 
two years of backwardness demanded by Binet for 
such a condition, but who, nevertheless, certainly 
belonged in the special school, because they failed
completely in the regular school. 

This pedagogical retardation that is non-intel- 
lectually conditioned is, as will be understood, in 
some cases a product of external conditions, in par-
ticular of poor home conditions, neglect, change of
residence and school, long illness, etc. In other 
cases, however, what is lacking is something in- 
ternal: those volitional attributes that must supple- 
ment intelligence to produce useful men are not de- 
veloped to the same degree as the intelligence. 
There are, then, the morally feeble-minded: "chil- 
dren of this type, as one might expect, shirk their 
lessons, are up to all sorts of mischief in the class, 
are quite unaffected by punishments, and so forth, 
so that, despite good intelligence, they more or less 
often fail of promotion. Those cases in which these 
mental anomalies are accompanied by intellectual 
deficiency of a small degree prove to be especially 
unpropitious" (Kramer, 54, p. 31).

5. Points of View for the Reorganisation and Im-
provement of the Gradation Method

Our discussion has revealed already a series of
more or less serious defects in the Binet-Simon
method, nor have these defects been removed by the
revision made by Binet himself in 1911. Nearly
every user of the method has called attention to
weaknesses of some sort in it; moreover, many do
more than merely criticize; they make proposals for
modifying or supplementing the method, or even
make use themselves without more ado of modified
methods of conducting the tests at this or that point.
But it would become a very serious matter if in-
dividual investigators, on mere grounds of personal
preference or chance bits of criticism, should be for-
ever making changes in an instrument of investiga-
tion that has attained international usage; on the one
hand such tinkering will destroy the balance of the
whole system in which every test is peculiarly bound
up with every other test, and on the other hand it
will put an end to the comparison of the results of
different investigators.

For these reasons it is to be recommended that
we proceed in the future in this way: wherever our 
object is to lay the emphasis on the substance of the 
results secured, as in the testing of children for 
practical purposes, let us for the present still con- 
tinue to use the old system, despite its evident de- 
fects. But independently from this, let investiga- 
tions directed to methodological issues be under- 
taken with the aim of constructing a gradation sys- 
tem that shall be revised in every particular. But 
this task is beyond the ability of the individual in-
vestigator: the problems to be solved are too many
and varied and the number of individuals that should 
be tested is too great. Rather is it true that here, if 
anywhere, is there opportunity for that community 
and division of work that is everywhere now de- 
manded in psychology. 

To prepare the way for the carrying out of such a 
program I enumerate here the chief points to be 
considered in this work of reconstruction and also 
offer for discussion some specific proposals of my 
own for modifications in the system. 

(a) Selection and appraisement of the various
tests. The criticism that has been passed upon the 
various tests has been based sometimes on theoreti- 
cal considerations, sometimes on practical results. 
The critique of Ayres (31), who has done no work
himself with the method, is an instance of the first
type. He complains that the tests have too seldom 
a direct relation to practical intelligence, that they 
principally concern such things as fluent use of lan- 
guage, memory span, response to problem questions 
that are quite foreign to real life, and also in part 
attainments that are to a great degree dependent 
on instruction and on influences of the home environ- 
ment, and also work with abstract concepts — some- 
thing, he says, with which only philosophers have to 
deal (!), whereas they do not touch the ability to get
on with the activities of life; he wants more "doing
tests" introduced. Although Ayres' criticism is jus-
tified in many respects, yet he seems to have over-
looked the fundamental fact that intelligence is a
formal activity, and that of necessity it is operative 
also in tasks whose content is not such as appears in 
real life. Indeed, problems of this sort have the 
methodological advantage that there is certainly no 
uncontrollable influence of training in them. 

More important are the criticisms that proceed 
from the empirical retrial of the tests. It has been 
shown that, as a matter of fact, there is too inti-
mate dependence on school and environmental in-
fluences in many of the tests; others could not be as-
signed positively to a specific age-level or showed
no clear differences in the performances of children 
of unmistakably different intelligence. Again, objec- 
tion is to be raised to those tests in which there is a 
strong probability that the right answer may be a 
matter of mere chance, like the tests: "Show me 
your right hand, your left ear" and "Is it forenoon 
or afternoon?"

The fitness of a test to be employed at all, and its
assignment to a given age-level is something that we 
shall be able in the future to work out by different 
methods. 

In the first place we have to make use of the rela- 
tion between the age-level and the test. In general, 
for a test to be valid for a certain level, the require-
ment is that approximately 75 per cent. of all chil-
dren of this age shall be able to pass the test. This
requirement would correspond to the normal stand- 
ard of validity previously mentioned (p. 45 f.), and 
Bobertag, as more recently Bell (32), has actually 
checked up the assignment of given tests to given 
age-levels in accordance with this principle; Terman
and Childs (63) take 66 per cent. for the critical
value, though this would seem, for the reasons al- 
ready cited, to be less appropriate. 

Taken alone, however, this principle is inadequate, 
for it does not inform us whether the test would be 
characteristic for just this age-level only and not 
just as much or nearly as much for another age-level. 
To determine this we must discover with what fre- 
quency the test is passed in other ages; and that 
test is most useful that shows the most decided ad- 
vance with age (a helpful methodological device to 
which Bobertag was the first to call attention). 

For the sake of illustration let us invent an example. Suppose
two tests have each been passed successfully by 75 per cent. of
9-year-old children, but that the one test shows little, the other
decided difference in the frequency with which it is passed by
8- and 10-year-old children. If Test A be passed by 65, 75 and 80
per cent. of 8-, 9- and 10-year-old children, respectively, and Test B
by 45, 75 and 90 per cent. in the same three ages, respectively, then
the latter test sets a task whose performance is just normal for
the 9-year-old, as compared with the 8-year-old children, and prac-
tically a self-evident activity for 10-year-old children; Test B is
then the more useful test.

Since the process of mental development brings
into maturity in succession a series of different part- 
functions, it follows that for each age there should 
be a series of tests to correspond to the phenomena 
of development that have just appeared; it must be 
possible, with the aid of the principle of decided ad- 
vance with age, to pick these tests out from a num- 
ber of others. 
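
The invented example of Tests A and B can be worked through in a few lines. This is a minimal sketch, not a procedure given by Bobertag: the function names and the 5-point tolerance around the 75 per cent. standard are assumptions made for illustration, and Test A's figure for the 10-year-olds is taken as 80 per cent.

```python
def advance_with_age(pass_rates):
    """Rise in pass rate (percentage points) from the youngest to the oldest age."""
    ages = sorted(pass_rates)
    return pass_rates[ages[-1]] - pass_rates[ages[0]]


def meets_standard(pass_rates, age, standard=75.0, tolerance=5.0):
    """True if roughly `standard` per cent of children of `age` pass the test.

    The tolerance of 5 points is an assumption; the text says only
    "approximately 75 per cent."
    """
    return abs(pass_rates[age] - standard) <= tolerance


# Percentages passing at ages 8, 9 and 10, as in the invented example.
test_a = {8: 65.0, 9: 75.0, 10: 80.0}
test_b = {8: 45.0, 9: 75.0, 10: 90.0}

for name, rates in (("A", test_a), ("B", test_b)):
    print(name, meets_standard(rates, 9), advance_with_age(rates))
```

Both tests meet the standard at age 9, but Test B advances 45 points between the 8th and the 10th year against 15 for Test A, which is why it is the more characteristic of the 9-year level.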

Again, the matter of correspondence between the 
results of different investigations must be consid- 
ered in the selection of the tests. A test that grades 
the same or nearly the same with German, French, 
English and American children has naturally more 
claim to be included in the final system than one that 
varies markedly with the examiner or with the ex- 
aminees. The table that Bell (32) has prepared is 
instructive in this connection. He presents, side by 
side, the age-rank that each of the Binet-Simon 
tests would have attained on the basis of the results 
of Binet, Levistre and Morlé, Johnstone, Goddard,
Bobertag, and Terman and Childs.*

In many of the tests the variations are quite large; thus, the
test of "comparing two objects from memory" ranges from the 6th
year (Johnstone) to the 9th year (Terman and Childs), the test
of "naming 60 words in three minutes" from the 10th year (God-
dard) to the 15th year (Levistre and Morlé, Terman and Childs).
The assignment of such a test to a single age-level becomes, then,
evidently an arbitrary matter. Over against these are other tests
that show great constancy, at least so far. Thus, "counting 13
pennies," "esthetic comparison," "showing right hand and left
ear" fluctuate in rank-assignment only between the 6th and the
7th years, the "recognition of omissions in drawings" only between
the 7th and the 8th years, the test of "counting backward from 20"
only between the 8th and the 9th year, that of "naming the months"
only between the 9th and the 10th year. The "hard problem-ques-
tions" test is ranked by all these investigators save Goddard in
the 12th year, etc.

* It must be remembered that the tables and materials from
which Bell had to construct his summary have been assembled so
differently by the different investigators that their gauging of the
several tests is not really directly comparable, so that Bell's tables
must be regarded merely as a preliminary attempt at checking up
the results of various investigations.

It will be seen that these are for the most part 
tests in which verbal formulation plays little or 
no part. It is, indeed, quite natural that where the 
problem and its answer are intimately connected 
with verbal expression, national peculiarities must 
make themselves evident; but it will be possible to 
reduce this source of error if more heed is given in 
the future to the transference of the tests from the 
one language to the other in such a way as to fit as 
exactly as possible the linguistic and cultural tone 
of the second nation and thus secure equal difficulty 
in the problems: the actual verbal translation that 
has been used by many investigators has often failed 
to meet this requirement. Thus, for instance, the 
rather free adaptation of the method that Bobertag 
made for Germany has yielded in many cases re- 
sults in closer accord with those of Binet than have 
the literal translations of the Americans. 

Finally, we shall have also to judge the value of a 
test according as it does succeed in bringing plainly 
to light differences in intelligence that are known 
from other sources to exist. On this point Mlle.
Descoeudres (46) has carried out a study of the 
present Binet tests, though, to be sure, upon but a 
limited number of children. She tested one "intelli-
gent" and one "unintelligent" child from each of the
six years of a boys' and of a girls' Volksschule: the
selection was determined by the teachers' estimates
of the pupils' intelligence. When, now, she com-
pared the results of the tests for the 12 unintelli-
gent and the 12 intelligent children, taken as groups,
she found that the differentiation of the two groups 
appeared with quite unequal clearness in the differ- 
ent tests. Those tests in which the intelligent had 
the clearest advantage over the unintelligent (and 
that therefore have the most claim for consideration 
as tests of intelligence) are cited in the first column 
of Table XIV. 

TABLE XIV
BINET TESTS WHEREIN A CLEAR DIFFERENCE IS SHOWN

Between Intelligent and        Between Normal and             Between Different Grades
Unintelligent Normal           Feeble-Minded Children         of Feeble-Minded Children
Children (Descoeudres).        (Chotzen).                     (Descoeudres).

Arranging 5 weights.           5 weights.                     Definitions.
Definition superior to use.    Definition superior to use.    Description of picture.
Counting backward.             Counting backward.             Comparison of two objects
Explanation of picture.        Comparison of two objects        from memory.
Noting omissions in drawings.    from memory.                 Problem-questions.
Detecting absurdities.         Repeating five digits, 16
                                 syllables, and the story.
                               Counting coins.
                               Making change.

Mlle. Descoeudres has also undertaken a study of
feeble-minded children (73), which may be intro- 
duced here for comparison. (We shall have occa- 
sion to discuss it in more detail later in another con- 
nection.) The children were arranged in order of 
the estimated degree of their feeble-mindedness and 
with this was compared their capacity in 15 differ- 
ent tests. Among these tests were six from the 
Binet-Simon series, four of which yielded extraordi-
narily high correlations (between 0.80 and 0.88)
with the estimated intelligence. These four tests 
are listed in the third column of Table XIV. With 
the tests of "knowing coins" and "naming of 60 
words in three minutes" the correspondence was of 
lesser degree. 

And thirdly, we must call to mind the results of 
Chotzen to which we have already referred (p. 87), 
in which certain tests gave far clearer expression 
than others to the difference between normal and 
feeble-minded children of the same age. These tests 
are listed in the second column of the table. 

It is worth noting that most of the tests appear 
several times in the three columns, despite the fact 
that the three investigations were carried out with 
children of quite different ages and under otherwise 
varying conditions. This shows that certain tests 
are particularly fitted to bring differences in intelli- 
gence out in clear relief, and at the same time it 
shows us a way to pick out these true tests of intelli-
gence from the rest.*

It is unnecessary to add that there is no reason 
why controls like these should be limited only to the 
tests already used by Binet; in fact, comparisons 
between intelligent and unintelligent pupils have al- 
ready been carried out for the most varied sorts of 
tests by Meumann, Winteler, Cohn-Dieffenbacher 
and many foreign investigators. From these and 
other future investigations like them there will surely
be found certain tests of so decided a symptomatic
value that they will deserve to be adapted for intro-
duction into the graded system. In this connection
we may allude among others to the modifications of
the Masselon test recently proposed by Meumann
(v. p. 16), a test that Meumann believes is usually
solved with logical insight by the intelligent but not
so by the unintelligent.

* There is one other point of correspondence that ought to be
mentioned, viz., that the differentiation of the children according
to their social status was also revealed for the greater part by
these same characteristic tests as revealed the intellectual differ-
entiation (see pp. 53f.).

Investigations in correlational psychology of 
which we shall speak in the next section likewise 
afford many tests whose results exhibit decided cor- 
respondence with estimated intelligence. These 
tests are evidently not such as can be introduced 
directly into the system of graded tests because they 
deal with fine gradation, whereas the Binet scale
recognizes only tests that present merely the al- 
ternatives "right" or "wrong." Possibly, how- 
ever, they will admit of rearrangement into a sim- 
pler form appropriate to the scale. 

By using all the methodological resources that we 
have cited we shall gradually succeed in selecting 
tests that are far more characteristic of the intelli- 
gence of a given age-level than those now in use and 
that are homogeneous for the different cultural 
groups and nations to be tested. 

(b) The composition of series for the several
years. Since intelligence is a formal capacity that 
can be determined only by multiform testing, care 
must be taken that each single age-level should have 
a manifold of tests. It is not enough, therefore, to 
put together any sort of separate tests that happen 
to be passed by 75 per cent. of those of the age-level
in question. If the tests are too similar to one an-
other, their combination does little more than the
testing by any one of them would do. Binet and
Simon did not keep this principle sufficiently in
mind: some of their age-levels contain only linguistic
tests and no tests of activity. 

Furthermore, the age-levels, considered as wholes,
must also be adjusted, for to demand that particular
tests be passed and to demand that all five tests of a
given age-level shall be passed are two entirely dif-
ferent things. This adjustment is rendered more dif-
ficult by the fact that in computing mental age one
must not only deal with the tests of one age-level, but
also make supplementary use of tests from the higher
levels; accordingly, in this adjustment of the levels
as a whole attention must be paid to the interrela- 
tion of tests that come into consideration in connec- 
tion with different near-by age-levels. 

The controlling principle for the adjustment or 
standardization of the age-levels is that approxi- 
mately symmetrical distribution of the mental ages 
must prevail for each level. That is, the tests are 
properly arranged and skillfully assembled into a 
system if, when a large number of unselected normal 
children of a given age are tested, a large middle 
group stand 'at age' and the rest are divided fairly 
equally between advanced and retarded cases. 
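
One way to check this requirement on a set of results is sketched below. The mental ages, the function names and the 10 per cent. tolerance for the balance of the two tails are hypothetical; the text itself fixes no numerical criterion of symmetry.

```python
from collections import Counter


def distribution_by_status(chronological_age, mental_ages):
    """Count children who test below, at, and above their age-level."""
    counts = Counter()
    for mental_age in mental_ages:
        if mental_age < chronological_age:
            counts["retarded"] += 1
        elif mental_age > chronological_age:
            counts["advanced"] += 1
        else:
            counts["at age"] += 1
    return counts


def roughly_symmetrical(counts, max_imbalance=0.10):
    """True if 'at age' is the largest group and the two tails nearly balance.

    max_imbalance is the tolerated difference between the tails, as a
    fraction of all children tested; 0.10 is an arbitrary illustrative value.
    """
    total = sum(counts.values())
    tails_balanced = abs(counts["advanced"] - counts["retarded"]) <= max_imbalance * total
    middle_largest = counts["at age"] >= max(counts["advanced"], counts["retarded"])
    return tails_balanced and middle_largest


# Hypothetical whole-year mental ages of twenty unselected 9-year-olds.
mental_ages = [7] + [8] * 4 + [9] * 10 + [10] * 4 + [11]
counts = distribution_by_status(9, mental_ages)
```

With these figures ten children stand at age and five fall on either side, so the arrangement of tests would count as properly adjusted for this level.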

To carry out such investigations practically it 
will be necessary to try as many tests as possible 
with each pupil; in this way it will be feasible to 
assign the passing of each particular test to this or 
that age-level and to discover the general arrange- 
ment of the tests that furnishes the closest to a sym- 
metrical distribution. 

The investigation can be made with more pre-
cision if the curve of distribution be based upon the
mental quotient instead of the mental age, for we 
should then anticipate fairly good correspondence 
between the curves of distribution of the different 
age-levels. The mental quotients for each age-level 
would then be grouped together in 10 per cent.
ranges, i. e., we should have first the children with
mental quotients ranging from 0.91 to 1.00 and 1.01
to 1.10 that would form the compact middle group, 
then on either side of them groups of rapidly dimin- 
ishing frequencies, those with mental quotients .81 
to .90, .71 to .80, etc., below, and those with quotients 
1.11 to 1.20, 1.21 to 1.30, etc., above. 
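
The grouping into 10 per cent. ranges might be tabulated as in the following sketch. The quotients are hypothetical, and the ranges are taken as 0.91 to 1.00, 1.01 to 1.10 and so on, so that no value falls into two groups; that reading of the boundaries is an assumption.

```python
from collections import Counter


def quotient_range(mental_quotient):
    """Return the 10 per cent range (e.g. '0.91-1.00') holding a quotient.

    The quotient is first rounded to two decimals; ranges run 0.81-0.90,
    0.91-1.00, 1.01-1.10, and so on.
    """
    hundredths = round(mental_quotient * 100)
    index = (hundredths - 1) // 10
    lower, upper = index * 10 + 1, index * 10 + 10
    return f"{lower / 100:.2f}-{upper / 100:.2f}"


def tabulate(quotients):
    """Frequency of children in each 10 per cent range of the quotient."""
    return Counter(quotient_range(q) for q in quotients)


# Hypothetical quotients for one age-level.
quotients = [0.78, 0.86, 0.93, 0.97, 1.00, 1.02, 1.05, 1.10, 1.14, 1.23]
table = tabulate(quotients)
```

Tables of this kind, drawn up for each age-level, could then be laid side by side to see whether the curves of distribution correspond.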

In the older form of the Binet-Simon scale the 
number of tests assigned to each year differed. In 
1911 Binet put five tests in every age; it is to be 
recommended that this idea of uniformity be fol- 
lowed in the future because the computation of the 
final status is much simplified in that way.*

* Cf. also the provisional new arrangement of Bobertag in Ap-
pendix II.

(c) The extension of the system. In the next
place the system of tests is to be extended beyond
its present limits and in different directions.

Thus far the lack of tests has been most seriously
felt in the upper years. The tests that Binet and
others have devised above the 11th year have been
thus far quite tentative and provisional; at the best
they could furnish us the necessary supplementary
material for the ascertainment of mental ages 10
and 11, but they absolutely fail to provide a direct
measurement for the mental ages 12 to 15. We must
admit that the discovery of appropriate tests for 
these higher levels of mental maturity is much more 
difficult than for the younger children, but the diffi- 
culty is to be overcome. Thus, Terman and Childs 
(64) have recently proposed a series of tests, each 
one of which is susceptible to diverse gradings with 
respect to the capacities that it requires, so that it 
can be employed up to mental age 15. Among these 
tests are arithmetical reasoning, familiarity with a 
list of selected words, a generalization test (discov- 
ering the 'moral' of a fable that is read to the sub- 
ject) and the Ebbinghaus completion test with the 
task made progressively more difficult.*

Let us hope that in such a way we may gradually 
advance from one year to another and may finally 
create a series for adults as the termination of the 
whole scale. However, this problem is certainly not 
so easy of solution as Binet thought when he trans- 
ferred to higher ages tests that he had originally de- 
veloped for the years 11, 12 and 13, and made the 
last of these groups over into tests for "adults" by
the addition of two new ones. 

Another thing that is greatly to be desired is an 
extension of the system by the creation of parallel 
series of tests for each year.† How gladly would
we use the method to trace the mental development 
of the same children through several years; but 
there are difficulties in the way of this, because, of 
course, when the same tests are repeated, the child
confronts them with a different attitude.‡ But if we
had at our disposal other equivalent series of tests, it
would be possible to make repeated testings of the
same individuals more frequently. In the same way,
when group tests were carried on, those children
between whom there might be danger of collusion
could be tested with different series. Finally, it is
valuable to have a supplementary series at hand in
case an investigation is rendered worthless by dis-
turbance or ineptitude of any sort.

* See Appendix II.
† Cf. Binet, 36, p. 163.

When we shall have undertaken simply those try-
outs of a considerable number of single tests sug-
gested above (cf. pp. 92 ff.), we shall certainly have
enough at our command to arrange parallel series
for each year: though there will be some difficulty in
securing an approximate equivalence between the 
corresponding scales. 

There has been some demand for yet another kind 
of extension of the scale. As it has appeared that 
the mental differences are extreme between one year 
and another in the case of the younger children, the 
need has been felt of intermediate stages, as for in- 
stance for specific standards for such age-levels as 
6.5, 7.5 and 8.5 years. In our opinion this need is 
to be satisfied in another way, viz., by use of the 
mental quotient, since this permits us to take frac-
tions of ages into account without special half-year
steps (see p. 105, below).

‡ Binet had five 9-year-old children tested twice with the same
tests with a 14-day interval. On the average, the children passed
2.5 tests more on the second trial, an amount that would signify
an increase of a half-year in mental age (pp. 164-5). As Bobertag
has shown, the danger resident in repetition is not so great as this
when the interval is longer (see above, p. 69); yet even under
these conditions the use of the same tests is but a make-shift and
a second and a third repetition of the same tests would be surely
quite out of the question.

Finally, mention may be made of other desires,
curae posteriores: differentiation of the scales for
children of different social strata, for the two sexes,
and especially series devoid of the speech factor for
the testing of the deaf and dumb, etc. 

(d) The computation of the final values. There 
are two main difficulties that demand our attention 
here. 

The one consists in the limitation of the measures 
of the mental age and the chronological age to whole 
numbers. This necessity of using whole numbers 
must often entail an arbitrariness that renders im- 
possible the carrying out of the method with pre- 
cision. 

For instance, a child who, when tested, lacks four months of
completing his eighth year of life, must of necessity be classed as
"8 years." If he passes the 7-year-old tests and two more, he still
receives the mental age of 7 years, and is, accordingly, credited
with a mental retardation of one year, although in reality there is
practically no retardation at all.

Bobertag* tried to circumvent this difficulty by
taking for his testing only those children that were
close to their birthday (at least within 2 months).
But usually there is no chance for free choice like
this: there are certain children to be tested, regard-
less of what their age happens to be at the time. Be-
sides, that kind of selection at most only lessens the
difficulty for the chronological age, not for the mental
age. The failure to consider the two or three ex-
cess tests passed still remains as a defect in the cal-
culations.

* 40, I, p. 110.

Hence, however enticingly simple they may be, we
shall have to give up the use of the rough whole-year 
designations, like 1, 2, 3 years of retardation, and 
make use of fractional values: it is enough, of 
course, to carry them to the first decimal place. In 
figuring mental age each single test passed in excess 
must, then, represent a fraction of a year. If, for 
example, two of the five 8-year tests are passed, then 
2/5 is to be added to the mental age; the child in 
our example just above would then have obtained a 
mental age of 7.4 years. Terman and Childs (64) 
are already making use of such a mode of calcula- 
tion, only theirs is made rather awkward by the pres- 
ence of different fractional values in the several 
years: when the year contains seven tests, each test
has only the value one-seventh, when five tests, one-
fifth. This feature, too, confirms our desideratum al-
ready expressed that every one of the years should
contain just five tests; then each test would have the
same value, 0.2 of its year. 
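
The rule of fractional credit can be stated as a one-line computation. The sketch below assumes, as the text recommends, five tests to the year unless another number is given; the function name is ours.

```python
def mental_age(base_level, excess_tests_passed, tests_per_year=5):
    """Mental age to one decimal: each excess test counts as 1/tests_per_year."""
    return round(base_level + excess_tests_passed / tests_per_year, 1)


# The child of the example: all 7-year tests passed, plus two more.
print(mental_age(7, 2))  # 7.4
```

Passing `tests_per_year=7` reproduces the less convenient case in which a year contains seven tests and each is worth one-seventh.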

But now, once the use of the convenient whole 
numbers be given up, every objection against the in- 
troduction of the mental quotient is removed, for this 
furnishes us a single fractional value in place of the 
two fractional values, chronological age and mental 
age. This quotient lies for normal children in the 
neighborhood of 1.00 and grades off continuously 
from this value in both directions. As compared 
with the older method of dividing by the rough units
of the age-levels, the use of the quotient has surely 
the advantage of affording a certain smoothness and 
continuity in the results, since the fraction (mental 
age divided by chronological age), when the decimals
are used in each term, may assume any value
whatever. Thus the mental quotient becomes not 
only a useful methodological device for the testing 
of abnormal children, but also a device to be recom- 
mended for use with normal individuals. We have 
already mentioned (p. 101) a case illustrative of its 
application. 
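
As a minimal sketch, the quotient can be computed as follows. The function name is our own and the ages in the example are hypothetical, not taken from the case cited above.

```python
def mental_quotient(mental_age, chronological_age):
    """Mental age divided by chronological age, both in (decimal) years."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age


# Hypothetical child: mental age 7.2 years, chronological age 8.0 years.
quotient = mental_quotient(7.2, 8.0)
print(round(quotient, 2))
```

A child whose mental age equals his chronological age obtains the quotient 1.00; values below 1.00 indicate retardation and values above it advancement.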

The other difficulty of calculation pertains to the 
way in which scattered distribution of tests passed 
is handled in figuring mental age. As is well known, 
five scattered tests must be passed in order to add 
one year to the mental age, but no attention is then 
paid to the years in which these additional tests lie. 
Let us compare the two hypothetical examples which 
follow : 



Child A

All tests through the 6th year are passed; hence the basis
for computation is a mental level of 6 years.

    also passed in Age 7     two tests
    also passed in Age 8     three tests
    also passed in Age 9     three tests
    also passed in Age 10    two tests

    total of 10 tests = 2 years

Resulting mental level: 8 years


Child B

All tests through the 6th year are passed; hence the basis
for computation is a mental level of 6 years.

    also passed in Age 7     three tests
    also passed in Age 8     five tests
    also passed in Age 9     two tests
    also passed in Age 10    no tests

    total of 10 tests = 2 years

Resulting mental level: 8 years

There seems no justification for equating these 
two children, because the first one really stands decidedly
higher mentally by dint of his conspicuously
good capacities in the higher levels. To be correct,
we must credit more difficult tests (those lying in
higher levels) with a larger fractional value than 
the tests normal to the age in question when we fig- 
ure in these higher tests for addition to a lower age- 
level. We may propose a method of calculation for 
this purpose, that is not too complicated and that, 
like the mental quotient, takes account of the rela- 
tion of the several years to each other. A test from 
a higher level used to supplement a failure in a lower 
level shall be counted not merely as one test, but as 
a quotient of the two years in question. 

In our example just given, then, Child A would be figured out
thus: basal point, mental age of 6 years; the tests from the four
following years would be counted in this way:

Level 7 is formed by 2 tests from the 7th year (each of these
counted therefore as "1 test") and by 3 tests from the 8th year
(each of which is to be counted as 8/7 test). The 8th year would
be formed by 3 tests from the 9th year (each counting 9/8 test)
and 2 tests from the 10th year (each counting 10/8 test). We get,
therefore, as total additional credits:

    2 × 1     =  2   tests
    3 × 8/7   =  3.4 tests
    3 × 9/8   =  3.4 tests
    2 × 10/8  =  2.5 tests
                ---------
                11.3 tests



Since every 5 tests are worth one mental year, the above value
indicates a supplement of 11.3 ÷ 5 = 2.3 mental years; so Child A
gets a mental age of 8.3 years.

With Child B it works out thus: the 2 tests from age 9 serve to
supplement age 7, and therefore have the value 9/7 each; the
remaining 3 tests in the 7th year and likewise the 5 tests in the 8th
year count for their own level, and thus figure 1 each. 

    3 × 1     =  3   tests
    2 × 9/7   =  2.6 tests
    5 × 1     =  5   tests
                ---------
                10.6 tests

These 10.6 tests indicate a supplement of 2.1 years; so Child B
gets a mental age of 8.1 years, and his inferiority to Child A is
now brought out statistically.
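
The two computations above can be checked with a short Python sketch. The function name and the triple format are our own; the assignment of tests to the levels they supplement is copied from the two worked examples and is not derived by the code, since the text gives no general rule for it. The code carries exact fractions and rounds only at the end, whereas the text rounds each line to one decimal.

```python
def weighted_mental_age(basal_age, assignments, tests_per_year=5):
    """Mental age when a test from a higher year supplementing a lower
    level is credited with the quotient of the two years.

    assignments: list of (number_of_tests, test_year, level_supplemented).
    A test counted for its own level has test_year == level_supplemented
    and is therefore worth exactly one test.
    """
    credits = sum(n * test_year / level for n, test_year, level in assignments)
    return basal_age + credits / tests_per_year


# Assignments as given in the text for the two hypothetical children.
child_a = [(2, 7, 7), (3, 8, 7), (3, 9, 8), (2, 10, 8)]
child_b = [(3, 7, 7), (2, 9, 7), (5, 8, 8)]

print(round(weighted_mental_age(6, child_a), 1))  # 8.3
print(round(weighted_mental_age(6, child_b), 1))  # 8.1
```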



III. Estimation and Testing of Finer Gradations of 
Intelligence 

(With the aid of the method of ranks) 

1. The Problem 

The different degrees of intelligence that are revealed
by the Binet method are relatively gross:
within any one of its age-levels there are possible
other and very much finer gradations that escape
detection by its tests. Yet these very differences are
often enough just the ones of consequence, particularly
whenever we are dealing with the members of
a relatively homogeneous group. If, for instance,
we are comparing the pupils of a school grade that 
are of approximately the same age and of corre- 
sponding school training, these pupils fall mostly 
into the same mental level according to the Binet- 
Simon tests, yet they occupy a finely graded scale of 
ranks within this level. Hence, the question what 
place a pupil occupies among those of his age or of 
his class in respect to intelligence must be answered 
by other methods that seek to establish a rank-order 
of the individuals concerned. 

Rank-orders of the pupils of a class can be established
in quite different ways. In the first place
there is the school or pedagogical rank-order that is 
based on school performances. Thus we number the 
pupils according to the outcome of a school exercise:
the departmental teacher ranks them at the end of
the term on the totality of their work in his subject:
finally, all these ranks in different subjects are com- 
bined into a rank-order for the school certificate in 
which every pupil is assigned his "class-place." 

Since these rank-orders are always available, it is 
but natural to employ them for our problem, and 
this is what actually happened very often in the 
early stages of the experimental study of intelli- 
gence. Thus, for example, Ebbinghaus (5) divided 
his subjects into three sections on the basis of their 
class-marks in order to determine whether the dif- 
ferent groups responded with different degrees of 
success to his completion tests. Other investigators 
had the teachers select a number of 'good' and 'poor'
scholars in order to make comparison of their be- 
havior under experimentation. 

Yet, however convenient this ever-ready classification
may be, it is not at all adequate for our purposes,
because the implicit assumption that underlies
such uses of the class-marks, namely that school
performance is an absolutely accurate indication of
intelligence, is unjustified. The results obtained by
the Binet-Simon method have already shown this
(see p. 59), and other statistical data will confirm it.
As a matter of fact, every school man who is blessed
with psychological insight knows it himself.

We need, then, a rank-order that is based directly 
upon the degree of intelligence of the pupils. 

Such an order does not exist in the ordinary school 
system, and must therefore first be created ad hoc. 
There are available, again, two different ways of accomplishing
this aim: either the teacher, on the basis
of everything that he knows about his pupils, may
estimate their intelligence and arrange them according
to his estimate (see the next section), or we
can apply experimental tests of intelligence, the outcome
of which admits of arranging the pupils in a
series. Rank-orders of intelligence are therefore
divided into orders based on estimates and orders
based on tests.

In the last resort the second of these divisions 
brings us to the question: Is it possible, on the basis
of a short examination with a series of tests to arrive 
at a gradation of pupils that corresponds with their 
actual differences of intelligence and such that the 
rank that each gains is sufficiently characteristic of 
his grade of intelligence within the group? 

It is not hard to obtain a rank-order on the basis 
of a test or of a series of tests. To be sure, the Binet 
tests, most of which admit of a choice between the 
evaluations 'right' or 'wrong,' are not fitted for that
purpose, but we can obtain a gradation in all those 
tests that bring into operation a measurable per- 
formance, in which what is measured is the quantity 
of the performance in a given time or the quality of 
the performance (as indicated by the number of er- 
rors). Every test of this sort creates an array of 
ranks, only it remains to discover how much the ob- 
tained order may inform us about the intelligence of 
the examinees. 

Hence we need here, too, a device for gauging our
work, and this device consists in the comparison of 
several rank-orders obtained with the same individ- 
uals by means of the method of correlation. 

The method of calculating correlation can not, of
course, be developed here.^ We merely point out
here that a correlation = 1.00 means that there is
complete correspondence between the two rank-orders
(or groups). A correlation = 0 means complete
absence of correspondence. The size of the
decimal fraction between 0 and 1, then, shows the
degree of correspondence. The probable error
(P. E.) is a measure of reliability: only when the 
correlation amounts to at least three times the size 
of its probable error is a real significance to be 
ascribed to it. In details the methods used by differ- 
ent investigators for calculating the coefficient of cor- 
relation show considerable differences. The appen- 
dix contains an example of the simplest method of 
calculating rank-correlation as I have used it. 
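
For orientation only, a rank-correlation and the three-times-probable-error rule may be sketched in Python as below. Two assumptions are ours and are not taken from the appendix: the coefficient is computed by Spearman's rank-difference formula, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), which is exact only when no ranks are tied, and the probable error is approximated by the classical expression 0.6745(1 - r^2)/sqrt(n), which strictly belongs to the product-moment coefficient. All function names are hypothetical.

```python
import math


def rank_correlation(ranks_x, ranks_y):
    """Spearman's rank-difference coefficient for two equally long rank lists."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rank lists of equal length (at least 2)")
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def probable_error(r, n):
    """Assumed approximation: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


def is_significant(r, n):
    """Rule from the text: r must be at least three times its probable error."""
    return abs(r) >= 3 * probable_error(r, n)


estimate_ranks = [1, 2, 3, 4, 5, 6]   # hypothetical teacher's estimate
test_ranks = [2, 1, 3, 4, 6, 5]       # hypothetical test result
r = rank_correlation(estimate_ranks, test_ranks)
print(round(r, 2), is_significant(r, len(test_ranks)))
```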

All the different rank-orders that have been named 
can be brought into correlation with one another; of
these possible correlations we shall have to deal, for
our purposes, with: correlation of test with test,
correlation of estimates with school performance,
correlation of tests with estimates.

Correlations of tests with tests have been worked 
out in particular by Spearman (77, 79, 80). If the 
pupils of a class have been tested by means of sev- 
eral different tests and the resulting rank-orders 
show mutual high correlation, this is, in Spearman's 
opinion, a sign that the capacity operative in the 
tests depends upon a common factor (Spearman 
uses the expression "general intelligence" or "gen- 
eral ability"). This brings it about that Pupil A 

^A general exposition of these methods will be found in Betz 
(70) and Stern (1, chs. 19 and 20). [Also in the translator's 
Manual of Mental and Physical Tests, Ch. 3.]

ranks high in the discrimination of line-lengths, in
memory for nonsense syllables, etc., while Pupil Z 
ranks low in each of these functions. For, if the re- 
sult were conditioned by specific abilities, a given 
pupil would occupy very different ranks in the dif- 
ferent tests. Spearman has, indeed, found ex- 
traordinarily high correlations in some instances: 
on this account he holds it to be demonstrated that 
there really is such a thing as general intelligence, 
and that its grade can be experimentally determined 
by tests that correlate to a high degree with one an- 
other. 

We can agree with the first of these conclusions. 
We have already alluded frequently in what has 
gone before to the 'general' and 'formal' character 
of intelligence, whose influence is operative in activi- 
ties of widely differing character. Of course, we 
must admit that this influence of intelligence is 
never more than approximately uniform, that within 
the "general intelligence" of every person there 
exist certain specially strong and certain specially 
weak points, so that a truer picture of the total in- 
telligence of the individual is given by the idea of a 
mutual balancing or compensation of different capac- 
ities than by the idea of their equality or corre- 
spondence. 

But just here does the value of Spearman's 
method for the testing of intelligence become dubi- 
ous. If we select four or five tests that show very 
high intercorrelations in order to use their totals as 
a measure of intelligence, there exists the danger 
that we may be testing by them only a very restricted 
portion of the field of intelligence and leaving entirely
out of consideration other compensatorily important
portions. If, on the other hand, we keep the
idea of compensation in mind and therefore assemble 
together tests that correlate only moderately with 
one another, we have no way of knowing which com- 
bination of tests does actually secure that mutual 
balance on the basis of which as a whole a more cor- 
rect quantitative expression of intelligence is to be 
yielded. 

To the first of these points it can of course be re- 
plied: we can select tests that show a high correla- 
tion from such divergent realms of mental life that 
we shall avoid the danger of testing only a very re- 
stricted portion of the field of intelligence. But, in 
opposition to this, attention must be directed to a 
matter that has not been sufficiently heeded hereto- 
fore. All test investigations, even if they pertain to 
the most widely different realms of mental life, have 
this in common that they are experimental modes of 
procedure, so that all put the examinee into the same 
mental condition — that of being a subject. And this 
involves not only very definite adjustments of atten- 
tion, of mood, etc., but also in particular the habit of 
mere reacting, of taking at a given moment a re- 
ceptive attitude toward a problem set from without. 
Consequently, spontaneous intelligence is excluded 
by the conditions of the experiment ; there is not in- 
volved that intelligence that sets its own problems, 
that thinks out things independently beyond what is 
immediately given, that anticipates explanations by 
questions, and that in the real situations of life 
quickly works out the best way of confronting a situation.
And we have absolutely no way of knowing
whether we can infer from the reactive intelligence,
however many tests of it we make, to this spontane- 
ous intelligence. It is possible that certain tests or 
certain combinations of tests have a fairly high cor- 
relation with the spontaneous intelligence, but that 
is something that can not be determined from the ex- 
periments. 

Hence the mere comparison of tests with one an- 
other affords us neither a clear insight into the neces- 
sary compensations, nor a decision as to the sympto- 
matic value of the testing; rather must we seek the 
means of gauging the tests in some criterion that lies
outside of experiment. Such a criterion is supplied 
by the estimation of the pupils made by the teacher. 

It follows that this estimation of intelligence by 
the teacher thus comes to possess a methodological 
significance of its own, for it can be set up as a stand- 
ard for the comparison of other rank-orders, i. e., 
those obtained experimentally, only when we have 
first made sure of its own nature and its reliability. 
There are two ways of going about this; the one is 
by analysis of the procedure of the teacher when he 
does the estimating of his pupils' intelligence, the 
other is by determining objectively to what extent 
his estimation is dependent upon what he knows 
about the pedagogical rank-order of the children, 
their places in the class or their examination marks, 
etc. 

So, only when we have first discovered whether 
the estimated intelligence is a useful means of con- 
trol or under what precautions it is useful, can the 
real experimental problem be attacked: the problem
of finding out those combinations of tests that
correlate to a high degree and with great uniformity
with a reliable set of estimations of intelligence. 
This matter will be the subject of the three sections 
that follow. 

2. The Teacher's Estimation of the Intelligence of 
his Pupils 

The question whether a teacher is really able to 
estimate the degree of intelligence of his pupils is 
one that has no little importance even outside of our 
special and limited problem. It is surely practically 
worth while for the teacher, who is accustomed ordi- 
narily to pass judgments about his pupils primarily 
on the basis of their objective performance, to try 
for once to decide whether and to what extent a cer- 
tain capacity, namely general intelligence, is con- 
cerned in these performances. He will be obliged to 
study his pupils more carefully, to analyze their in- 
dividual disposition, and will perhaps come by this 
means to a better valuation of their work, to conclu- 
sions as to choice of curriculum, to advice as to the 
choice of vocations. 

The estimation of intelligence has an advantage 
over the experimental testing of intelligence in the 
fact that it is based upon longer and wider acquaint- 
ance with the pupil. For months the teacher has 
watched the behavior of a pupil in the oral and writ- 
ten tasks of different studies, has noted his ques- 
tions and answers, his interest or his indifference 
when dealing with many different subjects, his inde- 
pendence or his need of assistance in his work, and 
has seen as well his behavior when among his mates, 
in play in the school yard, on excursions, etc. He
has, then, a much broader range of information on
which to base his judgment of the pupil's intelligence
than has the experimenter, who has merely
taken a half-hour's time to record the response of
the child to a small number of tests. But, on the 
other hand, there are some great disadvantages in 
this method. The teacher does not, as a rule, stop 
to consider on what concrete facts of observation 
his judgment is based: the symptoms which de- 
termine his judgment are not controllable as regards 
their real importance; when the intelligence of sev- 
eral pupils is compared, the comparison is based on 
different facts that are not strictly comparable with 
one another. Finally, the teacher is often by no 
means clear as to what he is to understand by 'intel- 
ligence' when he makes his decisions. 

These considerations are enough to show that the 
estimation of intelligence is far from an easy matter, 
that it is not something to be demanded of every 
teacher as a matter of course. What is desired here 
is to seek the middle road between two opposite 
sources of danger: on the one hand we are threat- 
ened with the danger that the teacher can not free 
himself, when he estimates intelligence, from the 
pedagogical habit of judging his pupils by their 
schoolroom performance, in which event the rank- 
order for intelligence will be no more than a copy of 
the rank-order for class work, corrected in a few 
points. If the teacher tries to avoid this tendency, 
then there arises the other danger that he goes at 
the selection blindly and that the resulting rank- 
order becomes a mere product of chance. 

Evidently, the estimating of intelligence by the 
teacher makes contribution not only to the psychology
of the pupils, but also to the psychology of the
teacher. Teachers will vary a great deal in their 
capacity to undertake this work of estimating intelli- 
gence, so that for scientific investigations it becomes 
methodologically essential that tests shall not be com- 
pared with any sort of estimation of intelligence 
made by any sort of teacher, but that for this purpose 
teachers specially trained and specially gifted in 
psychology must be sought out and instructed specifically
in the nature of the task demanded of them.
An entertaining as well as instructive inquiry on 
this subject has been made by Binet (36; 71). 

He sent to numerous elementary school teachers a questionary 
asking them to state: 1st, to what extent they thought that error
might creep in when teachers sought to judge the intelligence of
their pupils, and 2d, what method they would pursue in order to 
arrive at an accurate estimate of intelligence. Binet soon saw 
that he had thus found an excellent scheme for classifying the 
intelligence of the teachers. 

The first question was not very well put. The answers simply
showed that there were some optimists who thought that they 
practically never made any mistake in appraising the intelligence 
of their pupils, while others were ready to admit that they might 
be mistaken in one case in every three. Not much else came out 
of this question. 

The answers to the second question were much more fruitful. 
The greatest variety of ideas concerning the nature of intelligence 
was exhibited: all possible attempts at defining it were made — all 
the way from the scholastic narrow-mindedness, which conceives 
intelligence as nothing but the capacity to acquire information, up 
to the neat formulation of one woman : 

"L'intelligence ne sert pas seulement 3, apprendre, elle sert 
surtout k 'faire sa vie.' " And the symptoms on which the teach- 
ers base their judgment of intelligence! It is declared that heed 
must be given to heredity, since higher intelligence is to be ex- 
pected from children of more intelligent parents. It is recom- 
mended to take note of the facial expression ; the more Intelligent 
child Is easily distinguished from the mentally lazy, dull child by 
his vivacious, open, mobile countenance. Some of the teachers lay 
stress on observation of their pupils during periods of free play, 
and would regard as intelligent children that displayed initiative 
and creative tendencies there. But the chief insistence is laid, as
would be expected, upon the behavior of the child under instruction,
and the attempt is made, with more or less success, to differentiate
the strictly intellectual factors in the school work from
the phases that depend more on mere memory: quickness of apprehension,
ability to solve problems in applied mathematics, understanding
of historical movements and relations, good orthography,
expressive reading and many other things are mentioned
as symptoms that serve the teachers for estimating their pupils'
intelligence. Finally, even the teachers themselves hit on the use
of the method of tests by asking of their pupils certain questions,
specially prepared for the purpose, the answers to which serve
them as a measure for gauging intelligence.

Binet then enters into a very careful criticism of 
these various ideas, calls attention to the strong and 
the weak features in them, shows that the "intelli- 
gence questions" invented for the express purpose 
are much less useful than the tests worked out by 
the psychologist after long years of experimentation, 
and comes to the conclusion that the estimation of 
intelligence by the teacher is not at all likely to 
render the exact testing of intelligence a superfluous 
process. It must be admitted that he might easily 
have given more emphasis to the reverse of this 
statement: he might have pointed out that the esti- 
mation of intelligence may possess advantages that 
are excluded on principle from the mere test, so that 
the estimation may therefore be indispensable as a 
supplement, and also in part as a control for the re- 
sults of tests.^ 



^This whole discussion of Binet's is couched in a light and
sketchy vein, but is on that account one of the prettiest examples
of the grace of his style and the pictorial clarity of his phrases.
We may cite here just two instances that are so neatly put as to
defy translation. Where he is speaking of the ease with which the
possession of information may be misinterpreted as token of intelligence,
he says: "La mémoire est la grande simulatrice de
l'intelligence." And the recommendation to watch children while
engaged in play as well as in school is coupled with the declaration:
"En classe ils sont dénaturés par la discipline."

We shall bring together next the chief requirements
that a teacher must keep in mind when he undertakes
an estimation of intelligence if it is to be
usable for scientific purposes. 

He should conceive of intelligence, just as we de- 
fined it at the beginning of this monograph, as "gen- 
eral mental adaptability to new problems and condi- 
tions of life," should give particular heed to the two 
attributes "general" and "adaptation to the new," 
and should guard against identifying with intelli- 
gence any sort of special ability or the mere posses- 
sion of information or readiness in speech. Because 
of the general nature of intelligence it is essential to 
take into consideration the way in which the child 
behaves in quite different situations and when con- 
fronted by problems of varied sorts. 

But, now, it is of the essence of the estimation of in- 
telligence that not only shall each pupil be judged in- 
dividually, but he shall also be compared with the 
other pupils and be placed in a definite relation of 
equality or inequality with them. For this purpose 
of comparison there is to be laid down a rule that, despite
its fundamental character, has by no means always
been observed: Only those pupils should be located
in a given rank-order of intelligence that are
sufficiently like one another in other respects. The 
reason is that the slight differences of intelligence that 
have to be considered in making the estimation have 
significance only when on the common basis of an 
otherwise homogeneous group. For this reason the 
comparative estimation of intelligence has usually 
been restricted to the pupils of a single class ; but it is 
necessary as well to be careful to secure homogeneity
within the class, not only by excluding children
who are plainly abnormal, but also by limiting
the estimation to a definite range of ages. If, for in- 
stance, in the 5th school year, normally corresponding 
to ages 10-11 years, the class contains some 13-year- 
old boys, they should not be included in a rank-order 
of intelligence, for the teacher is not in a position to 
determine what portion of the intelligence that they 
exhibit is to be ascribed to their greater age: that 
would have to be deducted even if they were com- 
pared to 11-year-old children. Hence, you must first 
sort out any class within which you would undertake 
an estimation of intelligence. It is impossible to lay 
down any hard and fast rule for this : the range of 
ages that should be included depends on different 
circumstances; a greater latitude of age is permissible
in maturer than in the younger years. In general,
the calculations that I shall speak of below indicate
that some 20 to 25 per cent. of the members of
every class must be excluded.

The arrangement of the pupils on the basis of their 
intelligence can be by groups or in a serial order. The 
first of these arrangements is far easier for the 
teacher. In fact, he is accustomed to classify almost 
all things that are judged in terms such that four to
six classes are formed; and a similar schema can be
applied to intelligence, as, for example, in the form:
I very high intelligence, II good, III medium, IV
slight, V very weak intelligence. Thus, Pearson and
those of his followers who have made use of estima- 
tions of intelligence in their statistical investigations 
of school children have been content with groupings 
of this sort (74, 76, 81).

Personally, I do not find much to recommend in
the method. Even though most of the pupils can be 
located without difficulty in one of the five groups, 
there still remains a fairly large number of pupils 
with whom the teacher remains in doubt as to which 
of two adjacent groups they may belong to. The 
final decision is then an arbitrary one, which, if often 
enough repeated, can destroy the value of the whole 
distribution. If, for instance, three such doubtful 
cases are assigned to Group II the proportioning of 
cases and the resulting correlations are quite differ- 
ent from what they would be if these cases had been 
assigned to Group III, where they might just as 
properly be placed. It follows that so small a num- 
ber of groups as this is entirely inadequate for the 
most important problems of correlation. Many of 
our tests yield finely graded rank-orders of pupils, 
and so it is to be desired that the estimated intelli- 
gence with which these tests are to be related should 
also take the form of a rank-order. 

The construction of a rank-order of all pupils 
based on estimations of their intelligence in this way 
naturally presents many difficulties. Many persons 
hold it as a matter of course to be quite impossible. 
But experience has shown that the task can be done. 
The division into groups that we have just dis- 
cussed can, indeed, be carried out as a preliminary 
step ; then the pupils within each of the groups must 
be further arranged in a scale, so far as practicable. 
At the limits of the groups, however, care must be 
taken, for it has been shown many times that shifts 
will have to be made at these points, as, for instance, 
a child originally assigned to Group II has to be
placed after the children lying in the upper section
of Group III. 

But, on the other hand, the principle of rank-ar- 
rangement must not be carried so far as to insist on 
assigning a precise place to every child at any cost. 
Often enough, especially in the middle region, it will 
be felt to be an arbitrary matter to give N a poorer 
place than M, because it will have been impossible 
to arrive at an unequivocal judgment as to the differ- 
ence in the worth of the intelligence displayed by the 
two children. The rule for such cases is to give the 
same rank-number to individuals of equivalent ability,
using the number that corresponds to the average
of the places that they occupy. Thus, if four
individuals that would have occupied the stations
5, 6, 7 and 8 seem of equal intelligence, each one receives
the rank-number 6.5, i. e., (5 + 6 + 7 + 8) / 4 = 6.5.

In case this process has to be followed repeatedly, 
the number of rank-differences that are at our dis- 
posal is reduced, but this disadvantage is more than 
compensated for by the advantage of avoiding arbi- 
trariness in the arrangement. It is no misfortune 
if no more than 20 or even a dozen different rank- 
numbers are forthcoming in the ranking of a class of 
30 pupils. 
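
The rule of the averaged rank-number can be sketched as follows. The function name and the input format (a list of groups, best first, each group holding the pupils judged equal) are our own choices for illustration, and the pupil labels are hypothetical.

```python
def rank_numbers(ordered_groups):
    """Assign rank-numbers, giving tied pupils the average of their places.

    ordered_groups: list of lists of pupil names, best first; pupils in the
    same inner list are judged to be of equal intelligence.
    """
    ranks = {}
    next_place = 1
    for group in ordered_groups:
        if not group:
            continue  # an empty group occupies no place
        places = range(next_place, next_place + len(group))
        shared = sum(places) / len(group)
        for name in group:
            ranks[name] = shared
        next_place += len(group)
    return ranks


# Four pupils who would have held places 5, 6, 7 and 8 are judged equal.
order = [["A"], ["B"], ["C"], ["D"], ["E", "F", "G", "H"], ["I"]]
print(rank_numbers(order))
```

Pupils E to H each receive 6.5, and the next pupil receives 9, so the places used up by the tie are not lost.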

We shall be obliged to dwell somewhat longer on 
the matter, already broached, of the dependence of 
the series of estimations on the pedagogical rank- 
order. The greater the role played by the school's 
rank-order in the ordinary management of the class, 
the greater will be this dependence. But there prevail
decided differences with regard to the class-marks.
The plan, which was formerly quite generally
followed, whereby each pupil had a "class-place,"
which determined his rank among his classmates
for a quarter of a year, is now becoming less
and less common. Sometimes the statement of
standing is accompanied by a designation of place in
the class, e. g., "promoted as 15th in a class of 27,"
without laying any further special stress on the
ranking; sometimes even this designation is lacking,
so that there exists no positive school rank-order at 
all. Of course, even where there is a school rank- 
order, it is to be desired that the teacher work out 
his estimation of intelligence as far as possible inde- 
pendently of this school ranking. On this account 
it is in every way objectionable to proceed, as is often 
done because it is the easiest way, by taking the 
school ranking as a starting point and simply shifting
the position of those children whose rank in this
list has been displaced by some special circumstance, 
like illness, transfer of school, evident laziness, etc. 
Burt, for example, worked in that way. To be sure, 
such a "corrected school ranking" is doubtless bet- 
ter than an uncorrected one for psychological pur- 
poses, but it by no means presents a correct ranking 
of intelligence. 

To secure as impartial a ranking of intelligence as
possible, the following procedure may be recom- 
mended. Write the names of each of the pupils to 
be ranked on a separate card, and arrange these first 
in alphabetical order. Then and only then sort out 
the cards into different groups according to their
intelligence, and finally try to settle upon rankings
within each group. The series thus secured is then
noted down with the proper number for each individual.

It is an excellent plan to do the work all over again 
after an interval of perhaps three or four weeks, and 
without referring to the series first obtained. The 
degree of correspondence between the two estima- 
tions is then to be determined by the correlation 
method. Only if the coefficient of reliability is high, 
i. e., if the two series are very similar to one another, 
should they be made the basis of further investiga- 
tions. If they are then to be used, it is best to con- 
struct an amalgamated series out of the two estima- 
tions by taking for each pupil the mean of the two 
rank-numbers that he has obtained and bringing to- 
gether these means for the new rank-order. 
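
The check described here can be sketched in a few lines of Python. The sketch uses the rank-difference formula of Appendix I for the coefficient; the rank-numbers, the function name and in particular the cut-off `HIGH_ENOUGH` are assumptions made for illustration, since the text fixes no numerical value for a "high" coefficient.

```python
def rank_difference_correlation(x, y):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two rank series of equal length (n >= 2)")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rank-numbers of six pupils, estimated twice
# by the same teacher some weeks apart.
first_estimate = [1, 2, 3, 4, 5, 6]
second_estimate = [2, 1, 3, 4, 6, 5]

reliability = rank_difference_correlation(first_estimate, second_estimate)

# Assumed threshold for a "high" coefficient of reliability.
HIGH_ENOUGH = 0.8

if reliability >= HIGH_ENOUGH:
    # Mean of the two rank-numbers for each pupil.
    mean_ranks = [(a + b) / 2 for a, b in zip(first_estimate, second_estimate)]
else:
    mean_ranks = None  # the two series are too dissimilar to be used

print(round(reliability, 3), mean_ranks)
```

The means obtained in this way are not yet the new rank-order; they still have to be brought together into ordinal rank-numbers, equal means sharing a station.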

Distinct differences in method will appear according
as the estimation of intelligence is made in elementary
schools (Volksschule) or in higher schools.

The elementary school teacher has the particular 
advantage that he is usually the only teacher of the 
class, and thus can observe the behavior of the chil- 
dren in the most varied activities, in technical and 
theoretical subjects, at play and at work. But just 
this very breadth of information also renders him in 
a certain sense less independent in his estimation. 
Because his knowledge of his pupils extends in so
comprehensive a manner over their school
performances, all their grades and the determination of
their class-places are the product of a single teacher,
so that it is psychologically easily intelligible that he
can not readily free himself from this judgment
that he has himself worked out, even when he
undertakes the entirely different problem of estimating
their intelligence.

With the teacher in the higher schools the situa- 
tion is different. He instructs only in certain sub- 
jects and thus comes to know his pupils only par- 
tially. This certainly renders the estimation of their 
intelligence difficult. The departmental teacher 
must especially guard against identifying special 
talent or lack of talent in his special subject with 
general intelligence or the lack of it. Yet he is less 
biassed in his construction of a rank-order by the 
same circumstances, in that the rank-order of the 
school, if there be one, is never his own work that 
might affect his judgment by auto-suggestion, but is 
something that has been obtained by the mere me- 
chanical addition of all the different performances 
of the pupils, including performances in other
subjects with which he has nothing to do. And this
entails a further advantage, viz., that estimations of
the intelligence of the same group of pupils can be 
obtained from the different departmental teachers 
who instruct them, and that these estimations can 
then be compared with each other and finally amal- 
gamated into a single series. Of course, for the pur- 
pose of making a comparison of this sort only those 
teachers should be drawn upon whose subjects of
instruction can afford a basis for a fairly exact
knowledge of the pupils; not, then, some teacher who
might give instruction to the class in some minor
accessory subject only.

It is evident, also, that teachers can be asked to 
make estimations of their pupils' intelligence only 
after they have become intimate with them; not, for
instance, right after the beginning of the school year.
Teachers that have accompanied a class and known 
them more than a single year supply especially 
favorable conditions for the work. 

3. Estimated Intelligence and School Performance 

These theoretical considerations may now be
illustrated by a series of quantitative results that bear
on the relation between estimated intelligence and 
school performance. I shall make use of some al- 
ready published material of English origin and also 
of some as yet unpublished material that has been 
gathered as opportunity offered by members of the 
psychological department at Breslau. 

In the English investigations (Table XV) the 
school performance has been measured in different 
ways; in some there were used the class-places, in 
others the results of school examinations, which are 
held regularly in all classes in England. Unfortu- 
nately, the special methodological measures that 
were taken in securing the estimated intelligence are
not precisely enough reported to permit us to pass 
any judgment concerning the reliability of the re- 
sults. 

In all cases there are clear, and in some high cor- 
relations — decidedly higher, it is to be noted, be- 
tween intelligence and the results of examinations 
than between intelligence and class-place (0.76 as 
compared with 0.68). This result is not without in- 
terest. So far as we may judge from the account 
of the investigation, the estimation of intelligence 
had been undertaken without the results of the
examinations having been known; indeed, in some
cases before these had taken place. The estimates,
then, were not affected by the ranking in the school 
tests, and the rather high correlation could therefore 
be regarded as a reliable expression of the degree of 
correspondence between intelligence and the work 
done in the examinations. Burt also had estimates 
of intelligence of the same pupils made by different 
teachers and by disinterested school-mates of the 
pupils. The correlations between these estimates 
are very high, but since all those who estimated set 
out from the already known rank-order of the pupils, 
which they had merely to correct, it follows that this 
high correspondence is nothing remarkable and that 
it has no scientific value. 

In the course of discussions of this subject in the 
meetings of the Psychological Seminary at Breslau, 
during the winter semester 1911-12, the need became 
evident of clearing up the methodological aspects of 
the whole subject of estimating intelligence by trials 
of our own. Fortunately, two of our members, who 
were engaged in practical school work, were ready to 
secure new material.* The results thus obtained are
worth noting because they very clearly demonstrate 
the methodological difficulties and the way to over- 
come them and also bring out the necessary differ- 
ence between the procedure in secondary and in ele- 
mentary schools. 

Principal Rindfleisch had the teachers in charge
of a boys' Volksschule prepare for their classes lists
that showed both the ranking of the pupils on the
basis of their performances and also their ranking
according to their intelligence. So far as it has
proved feasible I have calculated the rank-correlations
of these lists. It must be stated that quite a
number of the lists had to be excluded; some because
the teacher had been satisfied to present the material
arranged in a very few intelligence-groups, and some
because the necessary precautions of method had
plainly not been observed. Thus, there were many
lists that showed plainly that the rank-order for
school performances had been arranged first and
then the rank-order for intelligence had been
arranged from it with only a very few corrections.

*Hearty thanks are due to Principal Rindfleisch (Liegnitz) and
Dr. Scheifler (a high-school teacher at Görlitz) for their great
pains and for their courtesy in placing the material at my disposal.







TABLE XVI

         School                 Number    Number
Class    Year     Age           Tested    Omitted    ρ ± P. E.

VIa      1.       6.3-7.6       47         8         0.85 ± 0.05
VIb      1.       6.0-7.1       37        16         0.78 ± 0.07
Vb       2.       7.7-8.7       40         7         0.47 ± 0.1
                                (34)      (13)       (0.74 ± 0.08)
IVa      3.       8.3-10.0      45        13         0.87 ± 0.05
IIIa     4.       9.3-11.6      43        11         0.88 ± 0.05
IIa      5.       10.3-12.2     30        14         0.97 ± 0.03
Ia       6.       11.6-13.6     30        12         0.91 ± 0.05



The remaining lists, however, cover all the different
school grades. The important data are shown in
Table XVI. There it will be noted that in figuring 
the correlation I left out in each class a number of 
pupils whose age exceeded the proper limits. If we 
leave Class Vb out of consideration for the moment, 
the correlations are then uniformly markedly high — 
between .78 and .97; average without Class Vb = 
0.88. The fact that the correlation is higher than the
English correlation between estimated intelligence
and class-place is doubtless due to the circumstance
that the English were content to use a small number
of classificatory groups of intelligence, whereas in
our lists serial ranking was required. This more
difficult task brought out a somewhat higher dependence
on the school ranking, which was known by the
teachers and in fact prepared by them. Hence this
very high correlation is not a proper expression of 
the actual degree of connection between intelligence 
and school work, as will be seen more clearly upon a 
closer analysis of the lists. There are certain symp- 
toms by which one can tell very positively whether 
the teacher has or has not made the attempt to free 
himself from the suggestive influence of the school 
ranking; and the more seriously this attempt was 
made, the smaller was the correlation. 
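
The range and the average quoted above can be checked from the coefficients of Table XVI. The following Python lines are merely such an arithmetical check; the class labels are taken from the table and the variable names are arbitrary.

```python
# Correlations of estimated intelligence with school rank, per class (Table XVI).
correlations = {
    "VIa": 0.85,
    "VIb": 0.78,
    "Vb": 0.47,
    "IVa": 0.87,
    "IIIa": 0.88,
    "IIa": 0.97,
    "Ia": 0.91,
}

# Leave Class Vb out of consideration, as in the text.
others = [r for name, r in correlations.items() if name != "Vb"]
mean_without_vb = sum(others) / len(others)

print(min(others), max(others), round(mean_without_vb, 2))
```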

Mention must be made in this connection of the 
Class Vb, whose teacher plainly undertook the work 
with great independence and with fine psychological 
comprehension. This teacher settled the numbering 
for his intelligence series without glancing at his 
school-work series and sought to explain the cases 
of special discrepancy between school performance 
and intelligence by brief remarks ("moved in from
the country," "sick a long time," "poor home
conditions," etc.). The result was astonishing: a
correlation of only 0.47.

This one correlation is, in my opinion, psycholog- 
ically and methodologically more important than the 
much higher ones obtained for the other classes, be- 
cause the lack of higher correlation is certainly due 
not to any peculiar composition of the Class Vb, but
to the special care and capacity for judging of the
teacher who did the estimating.

I have made a further calculation for this Class
Vb by excluding those six pupils in whose cases there
existed, according to the teacher's notes, special
conditions. The correlation for the remaining 34 then
rose at once to 0.74; in other words, it approximated
very closely the lowest correlations computed for the 
other classes. From this it follows for this particu- 
lar class, and presumably as a general principle, too, 
that the low correlation first secured is not due to 
any distinct thorough-going discrepancy between de- 
gree of intelligence and school efficiency, but rather 
to an unusual discrepancy between endowment and 
performance in a minority of the pupils. This small 
group demands the special consideration of the 
teacher and individual treatment, for it is with them 
that the danger is greatest that the ordinary valua- 
tion of the children in terms of their school work 
may lead to an erroneous appraisement and han- 
dling. 

Turning to the higher schools, I have at my dis- 
posal now a single class only, but the estimation of 
the intelligence of this class has special value on ac- 
count of the great thoroughness and precautions of 
method adopted, and on account of the fact that sev- 
eral teachers joined in estimating the same children. 
Table XVII exhibits the correlations that I have 
computed. I have to thank the regular master of the 
class for the material. 

The class was an Untertertia grade in a Gymnasium.*
The regular teacher (Teacher A) was well
trained psychologically and as a member of my seminary
understood perfectly the things to be kept in
mind in estimating intelligence. As he had already
taught these pupils the year before and had given
them during the current year ten hours of instruction
a week (Latin and French), it may be taken for
granted that he possessed a really exact acquaintance
with the material before him. It can therefore
be said that his estimation of their intelligence was
made under specially favorable conditions. Moreover,
he had two other teachers estimate the same
pupils. Of these, Teacher B, it is true, instructed the
class only two hours a week in history and Teacher
C four hours a week in religion and German. The
instructions were to judge the children not on the
basis of the particular ability that they might have
displayed in the subjects the teachers taught, but on
the basis of the impression of their general
intelligence. Beside these three estimated orders there
was also available the series of 'class-places' of the
pupils. In making my calculations I excluded eight
pupils who were too old. There remained 23,
enough to permit the reckoning of valid correlations.
Now a first glance at the table shows that the
correlation between intelligence and class-place is much
lower than those obtained in most of the elementary
school classes. The specially reliable estimation of
Teacher A gives a correlation of 0.43. Of those for
the two other teachers, the one is somewhat higher,
the other somewhat lower. When we combine the
estimates of all three teachers into an amalgamated
estimation-series, we obtain again a correlation with
the class-place of 0.45, a value that coincides almost
exactly with that of the single elementary class Vb
that we have accorded special treatment. Hence it
appears that when an estimation of intelligence is
made with special thoroughness and caution, there
exists only a moderate degree of correlation between
it and school efficiency.

*This would correspond scholastically approximately to our first
high-school year. (Translator.)

TABLE XVII 

Class: U III of a Gymnasium (7th school year). 

Ages of those investigated: 13.5 to 14.5 years.

Number investigated: 23 (8 others omitted as too old).

Number of teachers estimating: 3 (Teacher A the principal
teacher).

Correlations between estimated intelligence and class-place:

Teacher A and Class-Place                          0.43 ± 0.13

Teacher B and Class-Place                          0.55 ± 0.12

Teacher C and Class-Place                          0.33 ± 0.14

Teachers B and C (combined) and Class-Place        0.49 ± 0.13

Teachers A, B and C (combined) and Class-Place     0.45 ± 0.13

Intercorrelations of the Estimations:

Teacher B and Teacher A                            0.69 ± 0.10

Teacher C and Teacher A                            0.65 ± 0.12

Teachers B and C (combined) and Teacher A          0.75 ± 0.10

The values obtained for all three of the secondary 
school teachers alike show that it is much easier for 
the individual teacher in the secondary school to free
himself from the influence of the class arrangement,
because this arrangement has not been determined 
by himself alone. 

The reliability of the result is augmented by the 
intercorrelation of the series of the teachers. These 
correlations are, in fact, much higher : highest (0.75) 
when the estimates of the two supplementary teach- 
ers are combined and related to the particularly 
trustworthy estimate of the regular class-teacher. 
That is to say, then, the estimations of intelligence
undertaken by the teachers quite independently of one
another exhibit a great similarity to one another,
despite the fact that the several teachers derived
their judgments from observations in quite different
school subjects. This decided correspondence of the
judgments of the teachers on their pupils'
intelligence, taken in conjunction with the fair degree of
independence of their judgment from the class-place,
seems to me to be a forcible argument for the
scientific usefulness of the method of intelligence
estimation. But the result also teaches us, when we
compare with it the experience gained in the elementary
school, that only such estimations of intelligence are
useful as have been carried out by an exact method
and with special psychological knowledge.

4. Rank-orders of Intelligence obtained by Tests 

We can now return once more to the starting point 
of this whole section, the experimental testing of in- 
telligence. For we may safely regard the estimation 
of intelligence by the teachers, when undertaken with 
the necessary precautions, as a suitable control-de- 
vice by which we can measure the reliability of 
experimental testing. 

The material just now available on the correlation 
between rank-orders obtained by tests and those ob- 
tained by estimations is, to be sure, very scanty, yet it 
is already enough to indicate the direction in which 
greater results are to be looked for. Here, too, do 
we come upon that principle that we found univer- 
sally applicable in intelligence testing : no single test 
of whatever kind, but only a skillfully combined sys- 
tem of tests yields a reliable gradation of intelligence. 

Burt in England and Ries in Germany have carried
on with normal children investigations pertaining
to this field. Burt deals with but a small number
of cases: one group of 30 'elementary' school-pupils
and another of 13 'secondary' pupils; Ries has
investigated five classes in an elementary school.
The very much more extensive and precise investiga- 
tions of the Breslau teacher, Hylla, have unfortu- 
nately not yet been completed. 

Burt (72) tested his classes with 12 different tests. 
The rank-orders obtained for the different tests show 
quite different correlations with the estimated rank- 
order — six tests over 0.50, six tests under 0.50. The 
tests that show the higher correlations are mostly 
those that pertain to attention, motor skill and mem- 
ory. These tests and their correlations are shown in 
Table XVIII. On the other hand, the tests of dis- 
criminative sensitivity uniformly show very low cor- 
relations with intelligence — a result worthy of note 
because there still prevails a tendency in many quar- 
ters to use sensory tests for testing intelligence. 

TABLE XVIII
BURT'S EXPERIMENTS WITH NORMAL CHILDREN

                                                Correlation with
                                                  Est. Intell.
                                                Elem.    Secondary
Test                                            School   School

1. Dotting. (A zig-zag row of dots traveling
   at constant speed must be hit with a
   pencil)                                      0.60     0.84

2. Spot pattern. (A group of dots to be
   reproduced by drawing after 5 exposures
   in a tachistoscope)                          0.76     0.75

3. Mirror. (A pattern visible only in a
   mirror is to be pierced at marked points)    0.67     0.54

4. Memory span for concrete and abstract
   words and nonsense syllables                 0.57     0.78

5. Alphabet. (Cards with the letters of the
   alphabet are to be properly arranged)        0.61     0.80

6. Sorting (50 playing cards of 5 different
   colors are to be sorted into 5 packs)        0.52     0.56

Resulting rank-order for all 6 tests            0.85     0.91

The correlations found by Burt are in general
somewhat lower for the elementary than for the
higher school, but no particular value is to be
ascribed to the higher correlations on account of the
small number, 13, of subjects in the second group.

Ries* used two methods: Method A is patterned
after the Ranschburg method of word-pairs; each 
word-pair is comprised of two words that stand in a 
causal relation to one another, e. g., 'hunger' — 'weak- 
ness.' Each pair was pronounced and their reten- 
tion tested by the method of right associates. Method 
B was in the form of an association experiment: to 
each word pronounced there was to be given as a 
response a word whose meaning stood in the relation 
of effect to cause with the stimulus word. In both 
methods the plan was to bring intelligence into action 
by the use of logical relations. And in fact the re- 
sults did furnish a very high correlation with esti- 
mated intelligence and with a small probable error, 
viz. : with Method A 0.59, 0.85, 0.89, 0.86 and 0.90 (in 
the different classes) and with Method B 0.85, 0.94, 
0.86, 0.91. 

A supplementary test undertaken for comparative 
purposes by means of the Ebbinghaus completion 
method gave in two classes somewhat smaller cor- 
relations. 

Ries' results doubtless show that the methods that
he proposes may lay claim to a place in a system of
tests for securing rank-orders of intelligence. On
the other hand, it should not be concluded from the
high correlations that Method A or Method B taken
alone is adequate for testing and ranking
intelligence. For, in the first place, Ries' results do not
present the requisite uniformity (in one class,
Method A correlated with estimated intelligence by
only 0.59), and it is very questionable whether
repetition of the tests in other places would furnish the
same high correlations. Again, each of his methods
tests only one phase of intelligence, and a comparison
of the two methods with one another shows how little
right we have to infer one phase from the other.
Thus, Ries gives for one class a table of the original
data from which I have been able to calculate some
results not mentioned by him (Table XIX). The
result is that the two methods do not correlate at all
highly with one another, only 0.61; in other words,
the ranking of intelligence by Method A furnishes a
distribution of stations that is in some parts
quite different from the distribution furnished by
Method B.

TABLE XIX

Ries' experiments on 24 boys (Mittelschule, 2d class, ages 12-14).

Test A: Method of word-pairs and right associates.

Test B: Association of effect to a given cause.

Correlation of Test A with Test B                          0.61

Correlation of Test A with Estimated Intelligence          0.85

Correlation of Test B with Estimated Intelligence          0.94

Correlation of Tests A and B (combined) with Est. Int.     0.98

*Reference 78. See also the extensive critical review of Ries by
Bobertag, Zeits. f. angew. Psych., 5, p. 207.
The example is, however, excellently adapted to 
point out the way toward the method that is to be ap- 
plied. 

What does it mean that both tests correlate so high 
with estimated intelligence, but so low with one
another? Plainly, this is possible only when the
rank-orders obtained by the tests deviate from the
rank-order obtained by estimation in contrary directions
in some portions of the series.

In illustration: if a pupil obtains Station 10 in the
estimated rank-order, Station 8 in the order of Test 
A and Station 12 in the order of Test B, and if a simi- 
lar thing occurs with other pupils, then the above- 
mentioned differences in the correlations follow of 
necessity. But it is equally evident that the combin- 
ing of the two stations for tests, 8 and 12, gives the 
so-called "resulting rank-place for tests," 10, a value 
that now coincides with the station for estimated 
rank-order. The two tests therefore mutually com- 
pensate one another and thus form, when combined, 
a measure of intelligence that comes much nearer 
the estimated intelligence than either test by itself. 
Put psychologically, the tests demand the activity of 
aspects of intelligence so different as to be very un- 
equally developed in one and the same person, but 
which, taken together, do characterize his degree of 
intelligence. 
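
The mutual compensation just described can be tried out on a small invented case. In the following Python sketch the eight pupils and their stations are hypothetical, not Ries' data; the coefficient is computed by the rank-difference formula of Appendix I, and the helper names `rho` and `to_rank_order` are assumptions of the sketch.

```python
def rho(x, y):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(x, y)) / (n * (n ** 2 - 1))


def to_rank_order(values):
    """Turn values into rank-numbers (1 = smallest); ties share the mean station."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_station = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_station
        pos = end + 1
    return ranks


# The single pupil of the text: stations 8 and 12 combine to 10.
assert (8 + 12) / 2 == 10

# Hypothetical stations of eight pupils.
estimated = [1, 2, 3, 4, 5, 6, 7, 8]
test_a = [3, 1, 2, 4, 5, 6, 8, 7]
test_b = [1, 2, 4, 3, 7, 5, 6, 8]

combined = to_rank_order([(a + b) / 2 for a, b in zip(test_a, test_b)])

rho_a = rho(test_a, estimated)
rho_b = rho(test_b, estimated)
rho_ab = rho(test_a, test_b)
rho_combined = rho(combined, estimated)
print(round(rho_a, 3), round(rho_b, 3), round(rho_ab, 3), round(rho_combined, 3))
```

In this invented case each test correlates with the estimate more highly than the two tests correlate with one another, and the combined order comes nearer the estimate than either test alone.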

And, as a matter of fact, the correlation computed
from Ries' data did figure out so that the
amalgamated rank-order for the two tests presents the
extraordinarily high correlation with estimated
intelligence of 0.98.

In this way, then, the mutual compensation of tests 
that we have already set forth as a requirement, be- 
comes a controlling principle of the test-series, and 
the correlation method gives us a numerical device 
for discovering that combination of tests in which we
approach most nearly to perfect compensation. I
mean that we must combine together tests that
correlate less with one another than each one of them
correlates with estimated intelligence, and that com- 
bination whose amalgamated rank-order shows the 
highest and most constant correlation with estimated 
intelligence is the system of tests that we seek. Nat- 
urally, we shall not limit ourselves to two tests in 
making our system, but shall combine a larger num- 
ber into one compensation-system. 

This was the idea that incited Hylla to the investi- 
gations previously mentioned which are still in prog- 
ress. The idea of compensation as a principle in as- 
sembling tests has already arisen simultaneously 
both in England and in France. 

Thus, from the tests that had afforded the highest 
correlations with estimated intelligence in his in- 
vestigations, Burt worked out an amalgamated rank- 
order whose correlation with estimated intelligence 
considerably exceeded all the single correlations 
(Table XVIII). In the elementary school the single 
correlations ranged between 0.52 and 0.76, that of 
their combination amounted to 0.85; in the higher 
schools the single correlations ranged between 0.54 
and 0.84, while the correlation for the combined tests 
rose to 0.91. From this Burt draws the conclusion 
(pp. 158-9) : "By means, then, of some half-dozen 
tests, we are able independently to arrange a group 
of boys in an order of intelligence, which shall be de- 
cidedly more accurate than the order given by 
scholastic examination, and probably more accurate 
than the order given by the master, based on personal 
intercourse during two or three years, and formu- 
lated with unusual labor, conscientiousness and 




This conclusion sounds extremely optimistic in- 
deed, since the material that Burt had at his disposal, 
43 subjects, does not in the slightest degree suffice 
for the formulation of such a thesis. However, the 
principle that it embodies is so promising and so
illuminating as imperatively to demand a thorough
retesting by the most exact methods and on a very
extensive scale.

An analogous result has been found also with
feeble-minded children. Mlle. Descoeudres (73)
tested 14 children in an institution by means of 15
different tests. It is true that the children differed
very greatly in age (from 6.5 to 14 years), yet it was
possible to estimate their intelligence by the general
impression that they made in the house and in the
schoolroom and to arrange them in order on this
basis. The first column of figures in Table XX shows
the correlations of the several tests with the
estimated rank-order. The correlations are arranged in
order of their magnitude and run from 0.88 to 0.51.
Mlle. Descoeudres also calculated the amalgamated
rank-order for all the tests and found a correlation
of 0.99 between it and estimated intelligence, almost
complete correspondence, then, between the two
series. I have not myself checked up this value to see
if it is absolutely correct, but I have from the original
data calculated the correlation with intelligence for
each 5 tests, taken in combination (Column 2) and in
each case I found confirmation of the rule that the
amalgamated correlation was considerably higher
than the highest correlation of any single one of the
tests of which it was compounded.

TABLE XX
DESCOEUDRES: EXPERIMENTS ON FEEBLE-MINDED CHILDREN

                                   Correlations with Est. Intell.
                                   of each   of each    of all the
Tests                              test      5 tests    tests combined

Comparison of terms                0.878
Computation                        0.868
Describing pictures                0.842     0.91
Problem-questions                  0.817
Tactual discrimination             0.812

Definitions                        0.801
Stringing beads                    0.780
Inventiveness (a picture is
  shown: what are the persons
  in it talking about?)            0.761     0.84       0.99
'Patience' (restoring a cut-up
  picture)                         0.734
Knowing four coins                 0.699

Attention (cancelling a's)         0.671
Visual memory (5 objects)          0.646
Noting omissions in drawings       0.637     0.73
Auditory memory (5 words)          0.539
Naming 60 words in 3 min           0.509

A few hints may be added concerning certain other 
points to be observed in working at the problem of 
ranking by tests. 

(a) Measurability. It must be possible to ex- 
press the performance in the test conveniently and 
unequivocally by a numerical value; and these
numerical values must make sufficient differentiation
within a group that a rank-order of performance can be
drawn up. 

(b) Reliability. A test is reliable only when its 
outcome is a true expression of abilities and is not too 
much affected by variable and temporary conditions. 
Reliability is tested by applying the same (or an
analogous) test several times to the same group of
subjects. Only if these repeated testings show a high
degree of intercorrelation is the test valid
scientifically.

(c) Fairly high correlation, even of the single
test, with estimated intelligence. For tests that
of themselves exhibit little or no relation to
intelligence can not, of course, gain symptomatic
significance for intelligence by combination, however
many of them are combined.

(d) Comprehensiveness of the tests, and that in 
two directions. First, we should endeavor to bring 
into action the different functions concerned in intel- 
ligence (see above, pp. 20 f.). Secondly, we should 
take care that the numerical records refer not only to 
quantity, but also to the quality of the performance, 
e. g., both to the number of units accomplished in a 
given time and also to the percentage of errors made 
during the work. 

(e) We should see to it that the estimation of
intelligence be done thoroughly and conscientiously.

(f) When a considerable number of tests have
been carried out upon a group, then combine the
results into different amalgamated rank-orders until
that combination has been found that yields the
strongest correlation with the estimated intelligence.
The combination should then be tested out on other 
groups. 
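
The search described under the last point can be sketched as a trial of every combination of tests. The Python sketch below is an illustration under stated assumptions: the three tests and the stations of eight pupils are invented, the coefficient is that of the rank-difference formula of Appendix I, and the names `amalgamate` and `best_combination` are hypothetical.

```python
from itertools import combinations


def rho(x, y):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(x, y)) / (n * (n ** 2 - 1))


def to_rank_order(values):
    """Turn values into rank-numbers (1 = smallest); ties share the mean station."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + 1 + end + 1) / 2
        pos = end + 1
    return ranks


def amalgamate(rank_lists):
    """Average each pupil's ranks over the tests, then re-rank the averages."""
    means = [sum(r) / len(r) for r in zip(*rank_lists)]
    return to_rank_order(means)


def best_combination(tests, estimated, min_size=2):
    """Try every combination of at least min_size tests; keep the best one."""
    best = None
    for size in range(min_size, len(tests) + 1):
        for names in combinations(sorted(tests), size):
            r = rho(amalgamate([tests[name] for name in names]), estimated)
            if best is None or r > best[1]:
                best = (names, r)
    return best


# Hypothetical stations of eight pupils in three tests.
estimated = [1, 2, 3, 4, 5, 6, 7, 8]
tests = {
    "A": [3, 1, 2, 4, 5, 6, 8, 7],
    "B": [1, 2, 4, 3, 7, 5, 6, 8],
    "C": [4, 3, 2, 1, 8, 7, 6, 5],
}

best_names, best_rho = best_combination(tests, estimated)
print(best_names, round(best_rho, 3))
```

A combination found in this way fits the group from which it was derived; as the text requires, it should then be tested out on other groups before it is relied upon.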

The construction of an amalgamated rank-order is 
very easy. The ranks obtained by each subject in the 
several tests are combined into an average value. 
These average values themselves do not form the se- 
ries desired, but must first be revised into ordinal 
numbers that represent the final rank-order. 

Example: The pupils have been tested in three tests. The best
pupil has obtained in the three trials the rank-places 3, 1, 1, the
second-best the places 1, 2, 4, the third the places 2, 4, 2, etc. The
averages are for Pupil A $\frac{3+1+1}{3} = 1.67$, for B
$\frac{1+2+4}{3} = 2.33$, for C $\frac{2+4+2}{3} = 2.67$. Hence, in
the final or amalgamated series,
A receives the Rank-place 1, B Place 2, C Place 3.
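
The same example may be followed through in a few lines of Python. This is a minimal sketch: the function name is hypothetical, only the three pupils of the example are entered, and equal averages are not provided for.

```python
def amalgamated_rank_order(ranks_by_pupil):
    """Average each pupil's rank-places, then number the pupils by average."""
    averages = {p: sum(r) / len(r) for p, r in ranks_by_pupil.items()}
    ordered = sorted(averages, key=averages.get)
    final = {pupil: place for place, pupil in enumerate(ordered, start=1)}
    return averages, final


# Rank-places of the three pupils in the three tests, as in the example.
ranks_by_pupil = {"A": [3, 1, 1], "B": [1, 2, 4], "C": [2, 4, 2]}

averages, final = amalgamated_rank_order(ranks_by_pupil)
print({p: round(a, 2) for p, a in averages.items()}, final)
```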

If we proceed in this manner we may, I think, ex- 
pect that the method of amalgamated ranks can be 
worked out into a systematized plan of procedure, as 
has already been done with the method of age-levels. 

Not until we combine both these ideas can we hope 
to master the whole field of intelligence testing. The 
system of levels draws the great wave-lines of mental 
development: the method of ranking sketches the 
finer ripples within each level, and in such a manner
that the precise evaluation of the degree of intelli- 
gence of the individual child shall be possible. At the 
same time, the purely psychological analysis of the 
behavior of the subject toward the test must not be 
neglected, because it supplements the quantitative de- 
termination of intelligence by making it possible to 
ascertain the qualitative 'coloring' of the intelligence 
in the individual case.

APPENDIX I.

Example of the Computation of a Correlation.

Correlation between 'class-place' (x) and a teacher's estimate of
intelligence (y).

Index of correlation:

$$\rho = 1 - \frac{6 \sum (x - y)^2}{n (n^2 - 1)}$$

Probable error:

$$\text{P. E.} = 0.706 \cdot \frac{1}{\sqrt{n}}$$

(n = 23 pupils in Untertertia, grade entered at about 12 years.)